patchwork/utf8
Portable UTF-8 and grapheme cluster handling for PHP. Provides pure-PHP fallbacks for mbstring, iconv, and intl Normalizer/grapheme_* functions plus UTF-8-aware replacements for native string functions, improving reliability across servers.
Installation
Add the package via Composer:
composer require patchwork/utf8
No additional configuration is required; it's a drop-in library.
First Use Case
Split a string into grapheme clusters (e.g., emoji with modifiers):
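// The class is declared as Patchwork\Utf8; alias it so the UTF8:: calls resolve on case-sensitive filesystems.
use Patchwork\Utf8 as UTF8;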
$text = "Hello 👨‍👩‍👧‍👦";
$graphemes = UTF8::str_split($text);
foreach ($graphemes as $grapheme) {
echo "Grapheme: {$grapheme}\n";
}
Outputs each "visual character" (e.g., the family emoji as one unit).
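To make the difference between bytes, code points, and grapheme clusters concrete, here is a minimal sketch that uses only native PHP functions (mbstring and PCRE), not this package's API. It assumes PHP 7 or later for the \u{...} escape syntax.

```php
<?php
// "e" followed by U+0301 COMBINING ACUTE ACCENT renders as a single "é".
$text = "e\u{0301}";

$bytes      = strlen($text);              // counts bytes
$codePoints = mb_strlen($text, 'UTF-8');  // counts code points
// \X matches one extended grapheme cluster in PCRE's UTF-8 mode.
$graphemes  = preg_match_all('/\X/u', $text);

printf("bytes=%d codepoints=%d graphemes=%d\n", $bytes, $codePoints, $graphemes);
// prints "bytes=3 codepoints=2 graphemes=1"
```

Only the last number matches what a reader sees on screen, which is why grapheme-aware splitting matters for user-facing length checks.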
String Manipulation
$parts = UTF8::str_split($text);
$rejoined = implode('', $parts); // Plain concatenation restores the original string.
$lower = UTF8::strtolower($text, 'en_US');
Character Inspection
if (UTF8::is_ascii($text)) { ... }
if (UTF8::in_range($char, 0x4E00, 0x9FFF)) { ... }
Normalization
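Convert text to a consistent Unicode normalization form (e.g., precomposed é vs e followed by a combining acute accent):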
$normalized = UTF8::normalize($text, UTF8::NFD);
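As an illustration of why normalization matters for comparisons, the sketch below uses PHP's native Normalizer class from the intl extension (one of the APIs this package is described as providing a fallback for). The helper name sameText is hypothetical and is not part of the library.

```php
<?php
// Two spellings of "é": one precomposed code point vs. "e" + combining acute accent.
$precomposed = "\u{00E9}";
$decomposed  = "e\u{0301}";

// Hypothetical helper (not part of patchwork/utf8): compare strings after NFC normalization.
function sameText(string $a, string $b): bool
{
    return Normalizer::normalize($a, Normalizer::FORM_C)
        === Normalizer::normalize($b, Normalizer::FORM_C);
}

var_dump($precomposed === $decomposed);        // bool(false): the byte sequences differ
var_dump(sameText($precomposed, $decomposed)); // bool(true): identical after NFC
```

Normalizing both sides before comparing or storing avoids treating visually identical input as different values.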
Form Request Validation
Normalize validated user input with UTF-8-aware functions:
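// The class is declared as Patchwork\Utf8; alias it so the UTF8:: calls resolve on case-sensitive filesystems.
use Patchwork\Utf8 as UTF8;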
$validated = $request->validate([
'username' => 'string|max:255',
]);
$safeUsername = UTF8::strtolower($validated['username']);
Database/ORM
Ensure UTF-8 compatibility in migrations:
Schema::create('users', function (Blueprint $table) {
$table->string('name')->charset('utf8mb4'); // Supports 4-byte UTF-8
});
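Blade Templates
Safely render user-generated content. Blade's {{ }} already escapes its output, so emit an explicitly encoded value with {!! !!} to avoid double encoding:
<div>{!! UTF8::htmlentities($userComment) !!}</div>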
Emoji Handling
strlen() counts bytes and mb_strlen() counts code points, so both may miscount graphemes (e.g., flags like 🇺🇸 are 2 code points but 1 visual character). Use UTF8::strlen() or UTF8::str_split() for accurate counts.
Performance
Expensive operations (e.g., UTF8::normalize()) on large strings can be slow.
Locale Dependence
Case conversion (e.g., strtolower) depends on the locale (e.g., ß → ss in German).
UTF8::strtolower($text, 'de_DE');
UTF-8 BOM
$content = UTF8::strip_bom(file_get_contents('file.txt'));
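For reference, stripping a BOM amounts to removing three specific bytes from the start of the content. The following is a minimal sketch in plain PHP; the function name stripUtf8Bom is hypothetical and does not claim to mirror the library's implementation.

```php
<?php
// Hypothetical helper (not part of patchwork/utf8): drop a leading UTF-8 BOM if present.
function stripUtf8Bom(string $content): string
{
    $bom = "\xEF\xBB\xBF"; // UTF-8 encoding of U+FEFF
    return strncmp($content, $bom, 3) === 0 ? substr($content, 3) : $content;
}

var_dump(stripUtf8Bom("\xEF\xBB\xBFhello")); // string(5) "hello"
var_dump(stripUtf8Bom("hello"));             // string(5) "hello"
```

Only a BOM at the very start is removed; the same byte sequence elsewhere in the content is left untouched.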
var_dump(UTF8::str_split($text)); // Visualize grapheme clusters.
if (!UTF8::is_valid($text)) {
throw new \InvalidArgumentException('Invalid UTF-8');
}
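The same guard can be sketched with PHP's native mb_check_encoding(), which is useful for checking what a validity test accepts and rejects. The helper name assertValidUtf8 is hypothetical, and the snippet assumes the mbstring extension (or a polyfill for it) is available.

```php
<?php
// Hypothetical helper: return the input unchanged, or throw if it is not well-formed UTF-8.
function assertValidUtf8(string $text): string
{
    if (!mb_check_encoding($text, 'UTF-8')) {
        throw new InvalidArgumentException('Invalid UTF-8');
    }
    return $text;
}

assertValidUtf8("caf\u{00E9}"); // passes: well-formed UTF-8

try {
    assertValidUtf8("\xC3\x28"); // lead byte 0xC3 followed by a non-continuation byte
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // prints "Invalid UTF-8"
}
```

Running a check like this at the input boundary means later string operations can assume well-formed text.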
Custom Normalization
Extend the library by wrapping its methods:
function customNormalize($text) {
return UTF8::normalize(UTF8::strtolower($text), UTF8::NFC);
}
Integration with Laravel Services
Bind the library to the container for global access:
$this->app->singleton('utf8', function () {
return new class {
public function split($text) {
return UTF8::str_split($text);
}
};
});
Testing
Use UTF8::str_split() to test edge cases (e.g., ZWJ emoji sequences, combining marks):
$this->assertCount(1, UTF8::str_split("👨‍👩‍👧‍👦")); // Family emoji.