Utf8 Laravel Package

patchwork/utf8

Portable UTF-8 and grapheme cluster handling for PHP. Provides pure-PHP fallbacks for mbstring, iconv, and intl Normalizer/grapheme_* functions plus UTF-8-aware replacements for native string functions, improving reliability across servers.


Getting Started

Minimal Setup

  1. Installation: Add the package via Composer:

    composer require patchwork/utf8
    

    No additional configuration is required; it's a drop-in library.

  2. First Use Case: Split a string into grapheme clusters (e.g., emoji joined by zero-width joiners):

    use Patchwork\Utf8;
    
    $text = "Hello 👨‍👩‍👧‍👦";
    // One array element per grapheme cluster.
    $graphemes = Utf8::str_split($text);
    
    foreach ($graphemes as $grapheme) {
        echo "Grapheme: {$grapheme}\n";
    }
    

    Outputs each "visual character" on its own line. Whether the family emoji stays together as one unit depends on the Unicode data in the installed intl/PCRE version.


Implementation Patterns

Core Workflows

  1. String Manipulation

    • Split/Join Graphemes (emoji-safe):
      $parts = Utf8::str_split($text);
      $rejoined = implode('', $parts); // plain implode(); joining needs no UTF-8 awareness
      
    • Case Conversion (multibyte-safe, not locale-specific):
      $lower = Utf8::strtolower($text);
      
  2. Character Inspection

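    • Check if a string is ASCII (native check for any byte above 0x7F):
      if (!preg_match('/[\x80-\xFF]/', $text)) { ... }
      
    • Validate Unicode ranges (e.g., CJK Unified Ideographs):
      $codePoint = Utf8::ord($char); // code point of the first character
      if ($codePoint >= 0x4E00 && $codePoint <= 0x9FFF) { ... }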
      
  3. Normalization

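    • Normalize strings (e.g., precomposed é vs. e followed by a combining accent):
      $normalized = \Normalizer::normalize($text, \Normalizer::FORM_D); // intl class, or the package's fallback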
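
To make the normalization point concrete, here is a minimal sketch using only standard PHP. It assumes PHP 7+ for the \u{...} escapes and guards the Normalizer call, since that class is only present when ext-intl or a userland fallback is loaded. The variable names are illustrative.

```php
<?php
$precomposed = "\u{00E9}";   // é as a single code point (U+00E9)
$decomposed  = "e\u{0301}";  // e followed by U+0301 COMBINING ACUTE ACCENT

// Byte-wise the two spellings differ even though they render identically.
$rawEqual = ($precomposed === $decomposed);

// Compare again after composing both to NFC, if a Normalizer class is available.
$nfcEqual = null;
if (class_exists('Normalizer')) {
    $nfcEqual = \Normalizer::normalize($precomposed, \Normalizer::FORM_C)
        === \Normalizer::normalize($decomposed, \Normalizer::FORM_C);
}

var_dump($rawEqual, $nfcEqual);
```

Normalizing both sides before comparing or storing is what keeps lookups (usernames, slugs, search terms) from treating the two spellings as different values.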
      

Laravel Integration

  • Form Request Validation: Lowercase validated input in a multibyte-safe way:

    use Patchwork\Utf8;
    
    $validated = $request->validate([
        'username' => 'string|max:255',
    ]);
    $safeUsername = Utf8::strtolower($validated['username']);
    
  • Database/ORM: Ensure UTF-8 compatibility in migrations:

    Schema::create('users', function (Blueprint $table) {
        $table->string('name')->charset('utf8mb4'); // Supports 4-byte UTF-8
    });
    
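  • Blade Templates: Safely render user-generated content. Blade's {{ }} syntax already escapes output with htmlspecialchars(), so an extra entity-encoding call would double-encode it:

    <div>{{ $userComment }}</div>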
    

Gotchas and Tips

Pitfalls

  1. Emoji Handling

    • Issue: strlen() counts bytes and mb_strlen() counts code points, so both miscount graphemes (e.g., a flag like 🇺🇸 is 2 code points but 1 visual character).
    • Fix: Use Utf8::strlen() or count(Utf8::str_split($text)) for grapheme-based counts.
  2. Performance

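    • Issue: Heavy operations (e.g., Unicode normalization or grapheme splitting) on large strings can be slow, especially when the pure-PHP fallbacks run instead of the native extensions.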
    • Fix: Cache normalized strings or defer processing until necessary.
  3. Locale Dependence

    • Issue: Lowercasing applies Unicode case mappings without consulting a locale, and it does not equate variants such as ß and ss in German.
    • Fix: For case-insensitive comparison, fold case instead of lowercasing:
      Utf8::strtocasefold($text);
      
  4. UTF-8 BOM

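    • Issue: Files with a BOM (e.g., saved by some Windows editors) start with the bytes EF BB BF, which can show up as stray characters or break parsing.
    • Fix: Strip the BOM early, for example with a native regex:
      $content = preg_replace('/^\xEF\xBB\xBF/', '', file_get_contents('file.txt'));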
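
The gap between bytes, code points, and grapheme clusters described in the first pitfall can be seen with native functions alone. This is a sketch, not the library's implementation: it assumes PHP 7+ and PCRE with Unicode support, and relies on the \X escape to match one extended grapheme cluster.

```php
<?php
// "e" followed by U+0301 COMBINING ACUTE ACCENT: one visible character.
$text = "e\u{0301}";

$bytes      = strlen($text);                   // counts bytes
$codePoints = preg_match_all('/./us', $text);  // counts code points
$graphemes  = preg_match_all('/\X/u', $text);  // counts grapheme clusters

printf("bytes=%d codePoints=%d graphemes=%d\n", $bytes, $codePoints, $graphemes);
// prints: bytes=3 codePoints=2 graphemes=1
```

A combining mark is used here rather than an emoji because its clustering behaves the same across PCRE versions, while newer emoji sequences depend on the Unicode data shipped with the runtime.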
      

Debugging Tips

  • Inspect Graphemes:
    var_dump(UTF8::str_split($text)); // Visualize grapheme clusters.
    
  • Validate UTF-8:
    if (!Utf8::isUtf8($text)) {
        throw new \InvalidArgumentException('Invalid UTF-8');
    }
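
As a cross-check when the library is not at hand, a common native idiom validates UTF-8 by running an empty pattern in PCRE's /u mode, which fails on malformed input. The helper name isValidUtf8 below is hypothetical and not part of patchwork/utf8.

```php
<?php
// Hypothetical helper: preg_match() with the /u flag returns false on malformed UTF-8.
function isValidUtf8(string $text): bool
{
    return preg_match('//u', $text) === 1;
}

var_dump(isValidUtf8("caf\u{00E9}")); // well-formed two-byte sequence for é
var_dump(isValidUtf8("\xC3\x28"));    // 0xC3 must be followed by a continuation byte
```

Rejecting malformed input at the boundary (request, file import, queue payload) tends to be simpler than repairing it later in the pipeline.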
    

Extension Points

  1. Custom Normalization: Extend the library by wrapping its methods:

    function customNormalize($text) {
        // Lowercase first, then compose to NFC so equivalent spellings compare equal.
        return \Normalizer::normalize(Utf8::strtolower($text), \Normalizer::FORM_C);
    }
    
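  2. Integration with Laravel Services: Bind a thin wrapper to the container in a service provider's register() method for global access:

    $this->app->singleton('utf8', function () {
        return new class {
            // Delegates to the library so callers depend on the binding, not the static class.
            public function split($text) {
                return Utf8::str_split($text);
            }
        };
    });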
    
  3. Testing: Use Utf8::str_split() to test edge cases (e.g., characters outside the BMP, ZWJ emoji sequences, combining marks):

    $this->assertCount(1, Utf8::str_split("👨‍👩‍👧‍👦")); // Family emoji; passes only if the installed Unicode data treats ZWJ sequences as one cluster.
    