stevebauman/purify
Laravel wrapper for HTMLPurifier to sanitize user HTML safely. Clean strings or arrays via the Purify facade, with optional per-call configuration. Publish a config file, tune allowed tags/attributes, and leverage caching for performance.
To begin using stevebauman/purify in a Laravel project, follow these minimal steps:
Installation:
composer require stevebauman/purify
php artisan vendor:publish --provider="Stevebauman\Purify\PurifyServiceProvider"
This installs the package and publishes the default configuration file (config/purify.php).
First Use Case: Clean a user-submitted HTML string to remove malicious content:
use Stevebauman\Purify\Facades\Purify;
$dirtyHtml = '<script>alert("XSS");</script><p>Hello</p>';
$cleanHtml = Purify::clean($dirtyHtml); // Returns: '<p>Hello</p>'
Key Files to Review:
config/purify.php: Default configurations and custom rules.
app/Providers/AppServiceProvider.php: Register custom definitions if needed.
app/Models/Post.php: Example of using PurifyHtmlOnGet cast for Eloquent models.
Basic Sanitization:
Use Purify::clean($input) for one-off sanitization tasks (e.g., form submissions, API payloads).
Example:
$cleaned = Purify::clean(request('content'));
Batch Processing: Sanitize arrays of HTML strings (e.g., bulk imports, comment sections):
$cleanedComments = Purify::clean($commentsArray);
Dynamic Configurations: Override default rules for specific use cases (e.g., admin vs. user-generated content):
$cleaned = Purify::config(['HTML.Allowed' => 'div,p,a[href]'])->clean($input);
Eloquent Integration:
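Use the PurifyHtmlOnGet cast to sanitize HTML each time the attribute is read (recommended for security):
// At the top of the model file:
use Stevebauman\Purify\Casts\PurifyHtmlOnGet;

// Laravel 11+
protected function casts(): array
{
    return [
        'content' => PurifyHtmlOnGet::class,
    ];
}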
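The casts() method above is the Laravel 11+ form. As a minimal sketch for applications on Laravel 10 or earlier, which declare casts as a property, the same cast might be registered as follows. The Post model name is taken from the key-files list; the App\Models namespace is an assumption.

```php
<?php

namespace App\Models;

use Illuminate\Database\Eloquent\Model;
use Stevebauman\Purify\Casts\PurifyHtmlOnGet;

class Post extends Model
{
    /**
     * Attribute casts, property form (Laravel 10 and earlier).
     *
     * @var array<string, string>
     */
    protected $casts = [
        // Purify `content` every time the attribute is read.
        'content' => PurifyHtmlOnGet::class,
    ];
}
```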
API Responses: Sanitize HTML before returning it in API responses:
return response()->json(['content' => Purify::clean($model->content)]);
Form Requests:
Extend FormRequest to sanitize inputs once validation has passed:
public function rules(): array
{
    return [
        'content' => 'required|string',
    ];
}

protected function passedValidation(): void
{
    // Updates the request input only; validated() still returns the original value.
    $this->merge([
        'content' => Purify::clean($this->input('content')),
    ]);
}
Middleware: Sanitize HTML in middleware for global protection:
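public function handle($request, Closure $next)
{
    // Only rewrite the field when it was actually submitted.
    if ($request->has('content')) {
        $request->merge([
            'content' => Purify::clean($request->input('content')),
        ]);
    }

    return $next($request);
}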
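When a middleware has to cover more than one field, the per-field logic can be pulled into a small helper. The following is a sketch only: sanitizeFields is a hypothetical function, not part of the package, and strip_tags stands in for the purifier so the example runs without Laravel. In an application the callable would be something like fn (string $html): string => Purify::clean($html).

```php
<?php

/**
 * Apply $clean to the listed keys of $input. Only keys that are present
 * and hold strings are rewritten; absent keys are never added.
 *
 * @param array<string, mixed>     $input
 * @param list<string>             $keys
 * @param callable(string): string $clean
 * @return array<string, mixed>
 */
function sanitizeFields(array $input, array $keys, callable $clean): array
{
    foreach ($keys as $key) {
        if (isset($input[$key]) && is_string($input[$key])) {
            $input[$key] = $clean($input[$key]);
        }
    }

    return $input;
}

// Stand-in cleaner for demonstration purposes (assumption: replaced by
// Purify::clean() in a real application).
$clean = fn (string $html): string => strip_tags($html, '<p>');

$result = sanitizeFields(
    ['content' => '<p>Hello</p><b>bold</b>', 'title' => 'Post'],
    ['content', 'bio'],
    $clean
);
```

Here only content is rewritten, title is left alone, and the missing bio key is not introduced. Inside handle(), the returned array could then be passed to $request->merge().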
Livewire/Alpine: Sanitize user-generated content in frontend interactions:
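// Alpine.js
document.addEventListener('alpine:init', () => {
    Alpine.data('editor', () => ({
        content: '',
        // async is required for await; assumes /sanitize responds with { html: "..." }
        async sanitize() {
            const response = await axios.post('/sanitize', { html: this.content });
            this.content = response.data.html;
        }
    }));
});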
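The Alpine snippet posts to a /sanitize endpoint that is not defined in this guide. A minimal sketch of such a route, assuming it lives in routes/web.php and returns the cleaned markup under a hypothetical html key (so the client would read it from response.data.html), might look like this:

```php
<?php

use Illuminate\Http\Request;
use Illuminate\Support\Facades\Route;
use Stevebauman\Purify\Facades\Purify;

// Hypothetical endpoint: the path matches the Alpine example,
// the response shape is an assumption.
Route::post('/sanitize', function (Request $request) {
    // Validate first so that clean() always receives a string.
    $data = $request->validate([
        'html' => ['nullable', 'string'],
    ]);

    return response()->json([
        'html' => Purify::clean($data['html'] ?? ''),
    ]);
});
```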
Queue Jobs: Defer sanitization to background jobs for performance:
SanitizeJob::dispatch($userInput)->onQueue('purify');
Cache Invalidation:
Forgetting to run php artisan purify:clear after updating definitions or configs causes stale rules to persist.
Fix: Run purify:clear in a post-update hook.
Overly Permissive Rules:
Allowing unrestricted attributes (e.g., a[*]) can reintroduce XSS risks.
'HTML.Allowed' => 'a[href|title|rel]',
CSS Property Gaps:
Default CSS definitions may lack modern values (e.g., text-align: start/end).
Fix: Add the missing values in a custom CssDefinition:
$definition->info['text-align'] = new \HTMLPurifier_AttrDef_Enum(
    ['start', 'end', 'left', 'right', 'center'],
    false // false = values are matched case-insensitively
);
Nested Configurations: An array passed to Purify::config() replaces the default settings rather than merging with them.
Fix: Define named configs in config/purify.php for reuse:
'configs' => [
    'comments' => ['HTML.Allowed' => 'p,b,i,a[href]'],
],
// Usage:
Purify::config('comments')->clean($input);
Performance Spikes:
Disabling caching (serializer: null) causes slowdowns in production.
'serializer' => storage_path('app/purify-cache'),
Inspect Purified Output: Log the cleaned HTML to verify rules:
\Log::debug('Cleaned HTML:', ['output' => Purify::clean($input)]);
Validate Configs: Use HTMLPurifier’s live config doc to test rules: http://htmlpurifier.org/live/configdoc/plain.html
Check Definitions: For custom elements/attributes, validate against the HTMLPurifier schema.
Error Handling:
Wrap Purify::clean() in a try-catch so that an exception thrown by HTMLPurifier does not break the request:
try {
    $cleaned = Purify::clean($input);
} catch (\HTMLPurifier_Exception $e) {
    \Log::error('Purify error:', ['input' => $input, 'error' => $e->getMessage()]);
    $cleaned = ''; // Fallback
}
Custom Definitions:
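Extend Html5Definition for modern elements (e.g., <dialog>, <slot>):
use HTMLPurifier_HTMLDefinition;
use Stevebauman\Purify\Definitions\Definition;
use Stevebauman\Purify\Definitions\Html5Definition;

class ExtendedHtml5Definition implements Definition
{
    public static function apply(HTMLPurifier_HTMLDefinition $definition)
    {
        // Start from the package's HTML5 rules, then add <dialog>.
        Html5Definition::apply($definition);
        $definition->addElement('dialog', 'Block', 'Flow', 'Common');
    }
}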
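A custom definition only takes effect once the package is told to use it. As a hedged sketch, assuming the published config file exposes a definitions key and that the class above lives in a hypothetical App\Purify namespace, the registration might look like this:

```php
<?php

// config/purify.php (excerpt; all other keys omitted)
return [
    // Assumption: App\Purify is a hypothetical namespace for the custom class.
    'definitions' => \App\Purify\ExtendedHtml5Definition::class,
];
```

Because definitions are cached, running php artisan purify:clear after this change avoids the stale-rules pitfall described under Cache Invalidation.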
Attribute Whitelisting: Dynamically allow attributes based on context:
$config = [
    // Entries use HTMLPurifier's tag.attr form.
    // data-* attributes (e.g., data-track) must first be registered in a custom definition.
    'HTML.AllowedAttributes' => ['a.href', 'a.title', 'a.rel'],
];
Purify::config($config)->clean($input);
URI Filtering: Restrict the allowed URI schemes (e.g., permit only http, https, and mailto links):
'URI.AllowedSchemes' => ['http', 'https', 'mailto'],
Image Sanitization: Restrict embedded resources such as images to a trusted host (e.g., allow only CDN-hosted images):
'URI.DisableExternalResources' => true,
'URI.Host' => 'cdn.example.com', // embedded resources from any other host are removed
Event Hooks:
Listen to purify.cleaning events to log or modify inputs:
Purify::cleaning(function ($input, $config) {
    \Log::debug('Sanitizing input', ['input' => $input, 'config' => $config]);
});