typo3/html-sanitizer
Standalone PHP HTML sanitizer from TYPO3. Define sanitizing rules via Behavior, apply multiple Visitors, and run through a Sanitizer built from reusable presets. Supports safe tag/attribute allowlists, value validation (e.g., regex), and encoding or removing invalid nodes.
composer require typo3/html-sanitizer
use TYPO3\HtmlSanitizer\Sanitizer;
use TYPO3\HtmlSanitizer\Builder\CommonBuilder;
$sanitizer = (new CommonBuilder())->build();
$cleanHtml = $sanitizer->sanitize('<div>Unsafe <script>alert("XSS")</script> content</div>');
Sanitize HTML from user inputs (comments, posts, etc.) to prevent XSS:
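use TYPO3\HtmlSanitizer\Behavior;
use TYPO3\HtmlSanitizer\Visitor\CommonVisitor;
// Custom allowlists are defined on a Behavior rather than on CommonBuilder
$tags = array_map(fn($name) => new Behavior\Tag($name, Behavior\Tag::ALLOW_CHILDREN), ['p', 'b', 'i']);
$link = (new Behavior\Tag('a', Behavior\Tag::ALLOW_CHILDREN))
    ->addAttrs((new Behavior\Attr('href'))->addValues(new Behavior\RegExpAttrValue('#^https?://#')));
$behavior = (new Behavior())->withTags($link, ...$tags);
$sanitizer = new Sanitizer($behavior, new CommonVisitor($behavior));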
$userInput = '<p>Hello <b>World</b>! <a href="javascript:alert(1)">Click</a></p>';
echo $sanitizer->sanitize($userInput);
// Output: <p>Hello <b>World</b>! <a>Click</a></p>
CommonBuilder: Predefined safe HTML rules (good for quick starts).
Sanitizer: Core class for sanitizing HTML.
Behavior: Define custom rules for tags, attributes, and values.
Define a sanitizer for a blog platform allowing only specific tags/attributes:
use TYPO3\HtmlSanitizer\Behavior;
use TYPO3\HtmlSanitizer\Visitor\CommonVisitor;
$behavior = (new Behavior())
    ->withTags(
        // Children are allowed via the Tag::ALLOW_CHILDREN flag
        new Behavior\Tag('h1', Behavior\Tag::ALLOW_CHILDREN),
        new Behavior\Tag('p', Behavior\Tag::ALLOW_CHILDREN),
        (new Behavior\Tag('a', Behavior\Tag::ALLOW_CHILDREN))
            ->addAttrs(
                (new Behavior\Attr('href'))
                    ->addValues(new Behavior\RegExpAttrValue('#^https?://#'))
            )
    );
$sanitizer = new Sanitizer($behavior, new CommonVisitor($behavior));
Extend rules based on user roles (e.g., admins get more tags):
$adminBehavior = $baseBehavior->withTags(
    (new Behavior\Tag('img'))
        ->addAttrs(
            (new Behavior\Attr('src'))->addValues(new Behavior\RegExpAttrValue('#^https?://#')),
            (new Behavior\Attr('alt'))
        )
);
Transform or replace specific nodes (e.g., convert <typo3> tags to text):
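// withNodes() returns a new Behavior, so keep the result
$behavior = $behavior->withNodes(
    new Behavior\NodeHandler(
        new Behavior\Tag('typo3'),
        new Behavior\Handler\ClosureHandler(
            // The handler may receive null instead of a node
            fn($node, ?\DOMNode $domNode) => $domNode === null
                ? null
                : new \DOMText('[' . $domNode->textContent . ']')
        )
    )
);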
Create reusable builders for different contexts (e.g., CommentBuilder, RichTextBuilder):
class CommentBuilder implements BuilderInterface {
    public function build(): Sanitizer {
        $behavior = (new Behavior())
            ->withTags(
                new Behavior\Tag('p', Behavior\Tag::ALLOW_CHILDREN),
                new Behavior\Tag('b', Behavior\Tag::ALLOW_CHILDREN),
                new Behavior\Tag('i', Behavior\Tag::ALLOW_CHILDREN)
            );
        return new Sanitizer($behavior, new CommonVisitor($behavior));
    }
}
Store sanitized HTML in the database or cache:
// In a Laravel controller
public function store(Request $request) {
    $sanitizer = app()->make(Sanitizer::class);
    $cleanHtml = $sanitizer->sanitize($request->input('content'));
    // Save $cleanHtml to DB
}
Register the sanitizer as a singleton:
// app/Providers/AppServiceProvider.php
public function register() {
    $this->app->singleton(Sanitizer::class, function() {
        return (new CommonBuilder())->build();
    });
}
Extend Laravel validation to sanitize inputs:
// app/Http/Requests/SanitizeRequest.php
// Runs after validation succeeds; FormRequest declares this hook as protected
protected function passedValidation() {
    $sanitizer = app(Sanitizer::class);
    $this->merge([
        'content' => $sanitizer->sanitize($this->input('content')),
    ]);
}
Create a Blade directive for sanitized output:
// app/Providers/BladeServiceProvider.php
Blade::directive('sanitize', function($expression) {
    return "<?php echo app(\TYPO3\HtmlSanitizer\Sanitizer::class)->sanitize({$expression}); ?>";
});
Usage:
{{-- In a Blade template --}}
<div>@sanitize($userInput)</div>
Behavior objects are immutable. Methods such as withTags() return a new instance, so always assign or chain the return value:
// Wrong: the returned instance is discarded and $behavior is unchanged
$behavior->withTags(new Behavior\Tag('div'));
// Correct: keep the new instance
$behavior = $behavior->withTags(new Behavior\Tag('div'));
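The pitfall can be reproduced without the library. The following is a minimal sketch, assuming a hypothetical TagList class (not part of typo3/html-sanitizer) whose withTag() clones itself the way immutable "wither" methods usually do:

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for an immutable value object; not the library's Behavior class.
final class TagList
{
    /** @var string[] */
    private array $tags = [];

    // Returns a modified copy and leaves the receiver untouched.
    public function withTag(string $name): self
    {
        $copy = clone $this;
        $copy->tags[] = $name;
        return $copy;
    }

    /** @return string[] */
    public function names(): array
    {
        return $this->tags;
    }
}

$base = new TagList();
$base->withTag('div');             // return value discarded: $base is unchanged
$extended = $base->withTag('div'); // keep the returned copy instead

echo count($base->names()), "\n";     // 0
echo count($extended->names()), "\n"; // 1
```

The discarded call is silent: PHP raises no warning, which is why this mistake tends to surface only as "my rule is not applied".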
// Allows only HTTP/HTTPS URLs
$hrefAttr = (new Behavior\Attr('href'))
->addValues(new Behavior\RegExpAttrValue('#^https?://#'));
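To see what this pattern accepts on its own, it can be checked directly with preg_match(). This sketch only exercises the regular expression; it does not show how the library normalizes attribute values before matching, and the sample URLs are made up for illustration:

```php
<?php
declare(strict_types=1);

// Same pattern as above, evaluated with plain preg_match().
$pattern = '#^https?://#';

$candidates = [
    'https://example.com/page',
    'http://example.com',
    'javascript:alert(1)',
    '//example.com/protocol-relative',
    'HTTPS://EXAMPLE.COM',   // the pattern is case-sensitive
    ' https://example.com',  // leading whitespace defeats the ^ anchor
];

$results = [];
foreach ($candidates as $value) {
    $results[$value] = preg_match($pattern, $value) === 1;
    printf("%-35s %s\n", $value, $results[$value] ? 'accepted' : 'rejected');
}
```

Only the first two candidates match. If uppercase schemes should pass as well, adding the `i` modifier (`#^https?://#i`) is one option.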
ALLOW_CUSTOM_ELEMENTS is off by default, so custom elements such as <my-custom> are blocked unless you enable the flag:
$behavior = $behavior->withFlags(Behavior::ALLOW_CUSTOM_ELEMENTS);
ENCODE_INVALID_TAG converts unsafe tags to HTML entities (e.g., <script> → &lt;script&gt;). Test this behavior with your use case.
When replacing nodes, make sure the new DOMNode objects come from the same document to avoid errors:
$newNode = $domNode->ownerDocument->createTextNode($domNode->textContent);
$domNode->parentNode->replaceChild($newNode, $domNode);
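The node replacement above can be tried in isolation with PHP's DOM extension. This is a small sketch under stated assumptions: the markup is invented, and loadXML() is used only to get a predictable tree, which is not how the sanitizer parses HTML.

```php
<?php
declare(strict_types=1);

$doc = new DOMDocument();
$doc->loadXML('<div>Hello <typo3>world</typo3></div>');

$target = $doc->getElementsByTagName('typo3')->item(0);

// Create the replacement through the owner document so both nodes share it.
$replacement = $target->ownerDocument->createTextNode('[' . $target->textContent . ']');
$target->parentNode->replaceChild($replacement, $target);

$result = $doc->saveXML($doc->documentElement);
echo $result, "\n"; // <div>Hello [world]</div>
```

Creating nodes via ownerDocument avoids mixing nodes from different documents, which would otherwise need an explicit importNode() call.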
$dirtyHtml = '<div>Test <script>alert(1)</script></div>';
$cleanHtml = $sanitizer->sanitize($dirtyHtml);
\Log::debug("Input: $dirtyHtml");
\Log::debug("Output: $cleanHtml");
Test edge cases such as <div></div>, <img src="x">, <div><script><p>Test</p></script></div>, and <div>Café</div>.
Implement VisitorInterface to add preprocessing/postprocessing:
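// AbstractVisitor implements VisitorInterface with pass-through defaults
class MyVisitor extends AbstractVisitor {
    public function enterNode(DOMNode $domNode): ?DOMNode {
        // hasAttribute() exists only on elements, so check the node type first
        if ($domNode instanceof DOMElement && $domNode->nodeName === 'a' && $domNode->hasAttribute('href')) {
            $domNode->setAttribute('rel', 'nofollow');
        }
        return $domNode;
    }
}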
$config = json_decode(file_get_contents('config/sanitizer.json'), true);
$behavior = (new Behavior())
->withTags(...array_map(fn($tag) => new Behavior\Tag($tag), $config['tags']));
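Because the tag list comes from a file, it helps to validate it before building Tag objects. The sketch below assumes a JSON shape of {"tags": [...]}, which the snippet above implies but does not specify; the inline JSON string stands in for the file contents.

```php
<?php
declare(strict_types=1);

// Assumed file contents; the real schema of config/sanitizer.json is not specified.
$json = '{"tags": ["p", "b", "i", 42, "", "p"]}';

// JSON_THROW_ON_ERROR turns malformed JSON into a JsonException instead of null.
$config = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

// Keep only non-empty string tag names and drop duplicates.
$tagNames = array_values(array_unique(array_filter(
    $config['tags'] ?? [],
    static fn($tag): bool => is_string($tag) && $tag !== ''
)));

echo implode(',', $tagNames), "\n"; // p,b,i
```

The cleaned $tagNames array can then be mapped to Behavior\Tag instances as in the snippet above.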
Reuse Sanitizer instances (they are stateless):
$sanitizer = (new CommonBuilder())->build();
foreach ($userInputs as $input) {
    $clean = $sanitizer->sanitize($input); // Reuse instance
}
$serializer = new \TYPO3\HtmlSanitizer\Serializer\XmlSerializer();
$sanitizer = new Sanitizer($behavior, new CommonVisitor($behavior), $serializer);