
HTML Purifier HTML5 Laravel Package

xemlock/htmlpurifier-html5

HTML5 definitions and tidy/sanitization rules for HTML Purifier, aligned with the WHATWG spec. Purify and normalize dirty HTML5 into valid output with an HTML5-ready config, plus flexible directives (e.g., safely allow YouTube iframes).

Getting Started

Minimal Setup in Laravel

  1. Install the package:
    composer require xemlock/htmlpurifier-html5
    
  2. Basic usage in a Laravel controller/service:
    use HTMLPurifier;
    use HTMLPurifier_HTML5Config;
    
    public function sanitizeHtml(string $dirtyHtml): string
    {
        $config = HTMLPurifier_HTML5Config::createDefault();
        $purifier = new HTMLPurifier($config);
        return $purifier->purify($dirtyHtml);
    }
    
  3. First use case: Sanitize user-generated content (e.g., comments, CMS entries) while preserving HTML5 semantic elements like <article> or <section>:
    $cleanHtml = $this->sanitizeHtml($request->input('content'));
    

Where to Look First

  • Configuration reference: HTML Purifier Config Docs + package-specific directives.
  • Laravel integration: Use the purify() method in a service class or form request to centralize sanitization logic.
  • Testing: Validate edge cases (e.g., nested <figure> tags, datetime attributes) with PHPUnit:
    $this->assertStringContainsString('<figure>', $purifier->purify('<figure><img src="x"></figure>'));
    

Implementation Patterns

Workflows

1. Centralized Sanitization Service

Create a dedicated service to encapsulate purification logic:

namespace App\Services;

use HTMLPurifier;
use HTMLPurifier_HTML5Config;

class HtmlSanitizer
{
    protected $purifier;

    public function __construct()
    {
        $config = HTMLPurifier_HTML5Config::create([
            'HTML.Allowed' => 'article,section,header,footer,nav',
            // Restrict URIs to our own host; URI.DisableExternal relies on URI.Host.
            'URI.Host' => 'example.com',
            'URI.DisableExternal' => true,
        ]);
        $this->purifier = new HTMLPurifier($config);
    }

    public function clean(string $html): string
    {
        return $this->purifier->purify($html);
    }
}

Usage in controllers:

$sanitizer = app(HtmlSanitizer::class);
$safeHtml = $sanitizer->clean($request->post('content'));

2. Dynamic Configuration per Context

Override defaults based on context (e.g., admin vs. user content):

// For admin posts (trusted content)
$adminConfig = HTMLPurifier_HTML5Config::createDefault();
$adminConfig->set('HTML.Trusted', true);

// For user comments (restricted)
$userConfig = HTMLPurifier_HTML5Config::createDefault();
$userConfig->set('HTML.Allowed', 'p,b,strong,a[href|title]');
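
The per-context overrides above can be kept in one place. The following is a minimal sketch, not part of the package: the function name purifierDirectivesFor() and the context names 'admin' and 'comment' are hypothetical, and the directive values are simply the ones shown above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: maps a content context to HTML Purifier directives.
 * The context names ('admin', 'comment') are illustrative assumptions.
 */
function purifierDirectivesFor(string $context): array
{
    $profiles = [
        // Trusted authors: keep the HTML5 defaults and lift the trust restriction.
        'admin'   => ['HTML.Trusted' => true],
        // Untrusted visitors: a narrow whitelist of inline formatting and links.
        'comment' => ['HTML.Allowed' => 'p,b,strong,a[href|title]'],
    ];

    // Unknown contexts fall back to the most restrictive profile.
    return $profiles[$context] ?? $profiles['comment'];
}
```

With the package installed, the returned array can be passed to the config factory in the same way as in the service class, for example HTMLPurifier_HTML5Config::create(purifierDirectivesFor('comment')). Falling back to the restrictive profile means a typo in a context name never widens what is allowed.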

3. Form Request Validation

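Combine with Laravel's validation to reject input that the purifier would change. Note that this strict comparison also rejects harmless markup the purifier merely normalizes (for example, attribute quoting or whitespace), so expect some false positives: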

public function rules()
{
    return [
        'content' => [
            'required',
            function ($attribute, $value, $fail) {
                $purifier = new HTMLPurifier(HTMLPurifier_HTML5Config::createDefault());
                if ($purifier->purify($value) !== $value) {
                    $fail('HTML contains disallowed tags.');
                }
            }
        ]
    ];
}
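
Because a strict string comparison also fails on harmless normalization, a looser check may suit validation better. The sketch below is a heuristic under stated assumptions: removedTags() is a hypothetical helper, it counts opening tags with a regular expression rather than parsing HTML, and $purify stands for any sanitizing callable such as [$purifier, 'purify'].

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: lists the tag names that occur less often after
 * sanitizing than before. $purify is any callable that sanitizes HTML.
 */
function removedTags(string $html, callable $purify): array
{
    // Count opening tags by name (closing tags start with "</" and are skipped).
    $countTags = static function (string $markup): array {
        preg_match_all('/<\s*([a-zA-Z][a-zA-Z0-9-]*)/', $markup, $matches);
        return array_count_values(array_map('strtolower', $matches[1]));
    };

    $before = $countTags($html);
    $after  = $countTags($purify($html));

    $removed = [];
    foreach ($before as $tag => $count) {
        if (($after[$tag] ?? 0) < $count) {
            $removed[] = $tag;
        }
    }
    return $removed;
}
```

Inside the validation closure, a non-empty result from removedTags($value, [$purifier, 'purify']) could trigger $fail() and name the offending tags, while whitespace or quoting changes alone would pass.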

4. Media Embeds (e.g., YouTube)

Whitelist specific iframes dynamically:

$config = HTMLPurifier_HTML5Config::createDefault();
$config->set('HTML.SafeIframe', true);
$config->set('URI.SafeIframeRegexp', '%^https?://(www\.)?youtube\.com/embed/%');
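
Before shipping an iframe whitelist, it helps to check which URLs the pattern actually accepts. This is a standalone sketch that applies the same pattern with preg_match(); the constant and function names are hypothetical, and it assumes the purifier matches the pattern against the full iframe src.

```php
<?php
declare(strict_types=1);

// Same pattern as in the configuration above.
const IFRAME_PATTERN = '%^https?://(www\.)?youtube\.com/embed/%';

/**
 * Hypothetical helper: true when $src matches the iframe whitelist pattern.
 */
function iframeSrcAllowed(string $src): bool
{
    return preg_match(IFRAME_PATTERN, $src) === 1;
}
```

For example, iframeSrcAllowed('https://www.youtube.com/embed/abc123') returns true, while 'https://youtube.com.evil.example/embed/abc123' and 'https://www.youtube-nocookie.com/embed/abc123' return false. The leading ^ anchor is what stops a whitelisted URL from being smuggled into a query string on another host.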

Integration Tips

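  • Laravel Container: Register an HTML5-configured purifier as a singleton so it can be injected anywhere. If you use a third-party Purifier facade, check its documentation for how to swap the underlying config; an extend() hook is not guaranteed to exist:

    // app/Providers/AppServiceProvider.php
    use HTMLPurifier;
    use HTMLPurifier_HTML5Config;
    
    public function register()
    {
        $this->app->singleton(HTMLPurifier::class, function () {
            return new HTMLPurifier(HTMLPurifier_HTML5Config::createDefault());
        });
    }
    

    Now resolve it with app(HTMLPurifier::class) or through constructor injection.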

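  • Caching: Reuse one purifier instance within a request instead of constructing it per call. PHP does not share objects between requests, so cross-request savings come from HTML Purifier's definition cache (see Cache.SerializerPath):

    $purifier = new HTMLPurifier(HTMLPurifier_HTML5Config::createDefault());
    // Reuse $purifier for every purify() call in this request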
    
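  • Testing: Mock the purifier with Laravel's partialMock(), which takes a closure and applies to code that resolves HTMLPurifier from the container:

    $this->partialMock(HTMLPurifier::class, function ($mock) use ($dirtyHtml) {
        $mock->shouldReceive('purify')
             ->with($dirtyHtml)
             ->andReturn('<p>Cleaned</p>');
    });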
    

Gotchas and Tips

Pitfalls

  1. Broken <a> in v0.1.9:

    • Issue: Version 0.1.9 incorrectly treated <a> as a block-level element. Always use >=0.1.10.
    • Fix: Update via Composer:
      composer require xemlock/htmlpurifier-html5:^0.1.10
      
  2. Empty <figure> Removal:

    • Issue: Empty <figure> tags are stripped by default (pre-HTML5 behavior).
    • Fix: Give each <figure> some content, such as an <img> or <figcaption>, before purifying:
      $clean = $purifier->purify('<figure><img src="x" alt=""></figure>');
      
  3. Form Security Risks:

    • Issue: Enabling HTML.Forms (default: false) allows phishing attacks via <form action="evil.com">.
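    • Fix: Only enable for content from trusted authors. HTML.Trusted widens the allowed set well beyond forms, so set it only when the input itself is trusted:
      $config->set('HTML.Forms', true);
      $config->set('HTML.Trusted', true); // Also unlocks other unsafe elements; trusted input only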
      
  4. Regex Performance:

    • Issue: Complex URI.Safe* regexes (e.g., SafeIframeRegexp) can slow down purification.
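    • Fix: Keep patterns short and anchored. PHP already caches compiled patterns per process, so a simpler pattern helps more than any manual pre-compilation: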
      $config->set('URI.SafeIframeRegexp', '%^https://trusted\.com/embed/%');
      
  5. Attribute Conflicts:

    • Issue: HTML5 attributes (e.g., datetime on <time>) may conflict with custom definitions.
    • Fix: Use HTML.AllowedAttributes to explicitly permit attributes. This is a whitelist, so every attribute not listed is stripped:
      $config->set('HTML.AllowedAttributes', ['time.datetime']);
      

Debugging Tips

  • Inspect Sanitized Output: Compare dirty vs. clean HTML to identify stripped elements:

    $dirty = '<div><script>alert(1)</script></div>';
    $clean = $purifier->purify($dirty);
    dd($dirty, $clean); // Check what was removed
    
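  • Collect Errors: HTML Purifier has no Debug directive. Its (experimental) error collector is enabled with Core.CollectErrors, which must be set before the purifier is constructed:

    $config->set('Core.CollectErrors', true);
    $config->set('Cache.SerializerPath', storage_path('app/htmlpurifier'));
    $clean = $purifier->purify($dirty);
    $errors = $purifier->context->get('ErrorCollector')->getRaw();
    

    Nothing is written to storage/logs/laravel.log automatically; pass $errors to Log::debug() if you want them there.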

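  • Validate Config: Setting an undefined directive triggers a PHP warning. To check names up front, look them up in the schema attached to the config:

    // $config->def is the HTMLPurifier_ConfigSchema this config was built from.
    foreach (['HTML.Allowed', 'URI.SafeIframeRegexp'] as $directive) {
        if (!isset($config->def->info[$directive])) {
            throw new \InvalidArgumentException("Unknown directive: {$directive}");
        }
    }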
    

Extension Points

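  1. Custom Elements/Attributes: HTML.Allowed only whitelists elements and attributes that the HTML definition already knows, with attributes listed in brackets after the element. An unknown element such as my-custom-element must first be registered on the raw HTML definition with addElement():

    $config->set('HTML.Allowed', 'article,section,time[datetime]');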
    
  2. Post-Processing: Use Laravel’s Str::of() to manipulate sanitized output:

    $cleanHtml = Str::of($purifier->purify($html))
        ->replace('class="old"', 'class="new"');
    
  3. Event Listeners: Sanitize on write with an Eloquent model event. Post is a placeholder model, and HtmlSanitizer is the service class from the Workflows section:

    Post::saving(function (Post $post) {
        // Clean the field just before it is written to the database.
        $post->content = app(\App\Services\HtmlSanitizer::class)->clean($post->content);
    });
    
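  4. Middleware: Sanitize selected input fields on every request:

    use App\Services\HtmlSanitizer;
    use Closure;

    class SanitizeHtmlMiddleware
    {
        // The sanitizer service is injected by Laravel's container.
        public function __construct(protected HtmlSanitizer $sanitizer) {}

        public function handle($request, Closure $next)
        {
            // Only touch requests that actually carry a string "content" field.
            if (is_string($request->input('content'))) {
                $request->merge([
                    'content' => $this->sanitizer->clean($request->input('content')),
                ]);
            }
            return $next($request);
        }
    }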
    

Configuration Quirks

  • **`HTML.Trusted