Stemmer Laravel Package

dompat/stemmer

Strictly typed PHP 8.3+ stemming library for better full-text search, indexing, and text analysis. Supports LIGHT and AGGRESSIVE modes, with extensible language drivers. Includes Czech and English out of the box; add custom locales easily.

Getting Started

Install via Composer:

composer require dompat/stemmer

First Use Case: Normalize search queries or index content for full-text search.

use Dompat\Stemmer\Stemmer;
use Dompat\Stemmer\Driver\EnglishDriver;
use Dompat\Stemmer\Enum\StemmerMode;

$stemmer = new Stemmer([new EnglishDriver('en')]);
echo $stemmer->stem('running', 'en', StemmerMode::AGGRESSIVE); // Output: "run"

Where to Look First:

  • Stemmer class for core functionality.
  • DriverInterface for extending to new languages.
  • StemmerMode enum for mode selection.

Implementation Patterns

1. Laravel Scout Integration

Pre-process searchable attributes before indexing:

// app/Models/Post.php
use App\Facades\Stemmer; // the wrapper from pattern 4 (app/Facades/Stemmer.php)
use Dompat\Stemmer\Enum\StemmerMode;

public function toSearchableArray()
{
    return [
        'title' => Stemmer::stem($this->title, 'en', StemmerMode::AGGRESSIVE),
        'body'  => Stemmer::stem($this->body, 'en', StemmerMode::LIGHT),
    ];
}
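
The stem() call shown in Getting Started takes a single word, while title and body are usually multi-word text. A minimal sketch of one way to handle that, assuming the text should be lowercased, split on non-letter characters, and stemmed word by word. The stemText() helper and the toy stemming closure are hypothetical and not part of the package; in an application the closure could wrap the real stem() call.

```php
<?php
declare(strict_types=1);

/**
 * Stem every word in a text and return the stems joined by single spaces.
 *
 * $stemWord is any callable that takes one lowercase word and returns its stem.
 */
function stemText(string $text, callable $stemWord): string
{
    // Split on anything that is not a Unicode letter or digit.
    $words = preg_split('/[^\p{L}\p{N}]+/u', mb_strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on failure; treat that as "no words".
    return implode(' ', array_map($stemWord, $words ?: []));
}

// Toy stand-in for a real stemmer: strips a trailing "ing" or "s". Illustration only.
$toyStem = static fn (string $word): string => (string) preg_replace('/(ing|s)$/', '', $word);

echo stemText('Walking dogs, fixing bugs!', $toyStem), PHP_EOL; // walk dog fix bug
```

Storing space-joined stems keeps the indexed field readable by the search engine's own tokenizer; whether that suits a given Scout driver is something to verify.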

2. Query Normalization

Normalize user input for search queries:

// app/Http/Controllers/SearchController.php
use App\Facades\Stemmer; // the wrapper from pattern 4 (app/Facades/Stemmer.php)

public function search(Request $request)
{
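    // $request->query is the query-string parameter bag, not a string; read the input explicitly.
    $stemmedQuery = Stemmer::stem((string) $request->input('query', ''), 'en');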
    return Post::where('title', 'like', "%{$stemmedQuery}%")->get();
}
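
Because the stemmed query is interpolated into a LIKE pattern, any % or _ typed by the user acts as a wildcard. A small sketch of escaping those characters first, assuming a database whose default LIKE escape character is the backslash (MySQL and PostgreSQL behave this way; others may need an explicit ESCAPE clause). The likeContains() helper is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Build a "contains" LIKE pattern from user input, escaping the LIKE
 * wildcards (% and _) and the escape character itself.
 */
function likeContains(string $term, string $escapeChar = '\\'): string
{
    // The escape character is replaced first so the backslashes added
    // for % and _ are not escaped a second time.
    $escaped = str_replace(
        [$escapeChar, '%', '_'],
        [$escapeChar . $escapeChar, $escapeChar . '%', $escapeChar . '_'],
        $term
    );

    return '%' . $escaped . '%';
}

echo likeContains('50%_off'), PHP_EOL; // %50\%\_off%
```

The result can be passed as the third argument of where('title', 'like', ...) in place of the interpolated string.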

3. Service Layer Abstraction

Centralize stemming logic in a service:

// app/Services/TextNormalizer.php
public function normalize(string $text, string $locale = 'en'): string
{
    return Stemmer::stem($text, $locale, StemmerMode::LIGHT);
}

4. Facade for Convenience

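Create a static wrapper to simplify usage (this is not a full Illuminate facade, and it assumes the Stemmer is bound in the container):

// app/Facades/Stemmer.php
public static function stem(string $word, string $locale, StemmerMode $mode = StemmerMode::LIGHT): string
{
    // Fully qualified so the container resolves the library class, not this wrapper.
    return app(\Dompat\Stemmer\Stemmer::class)->stem($word, $locale, $mode);
}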

5. Custom Driver Extension

Extend for unsupported languages (e.g., German):

// app/Driver/GermanDriver.php
class GermanDriver implements DriverInterface
{
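    public function stem(string $word, StemmerMode $mode): string
    {
        // Implement German stemming logic here. Returning the word unchanged
        // satisfies the declared string return type until then.
        return $word;
    }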
}

// Register in service provider
$stemmer->addDriver(new GermanDriver('de'));
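
To make the driver idea concrete, here is a self-contained toy sketch of suffix stripping with a light and an aggressive mode. ToyMode, ToyDriver, and ToyGermanDriver are local stand-ins invented for this example; the package's real StemmerMode and DriverInterface may differ, and the suffix lists are illustrative rather than linguistically complete.

```php
<?php
declare(strict_types=1);

// Local stand-ins so the sketch runs on its own.
enum ToyMode
{
    case LIGHT;
    case AGGRESSIVE;
}

interface ToyDriver
{
    public function stem(string $word, ToyMode $mode): string;
}

final class ToyGermanDriver implements ToyDriver
{
    // Inflectional endings, longest first so "ern" is tried before "n".
    private const LIGHT_SUFFIXES = ['ern', 'en', 'er', 'es', 'e', 'n', 's'];
    // Derivational endings, removed only in aggressive mode.
    private const AGGRESSIVE_SUFFIXES = ['ungen', 'ung', 'heit', 'keit', 'lich'];

    public function stem(string $word, ToyMode $mode): string
    {
        $word = mb_strtolower($word);

        if ($mode === ToyMode::AGGRESSIVE) {
            $word = $this->stripFirstMatch($word, self::AGGRESSIVE_SUFFIXES);
        }

        return $this->stripFirstMatch($word, self::LIGHT_SUFFIXES);
    }

    /** Remove the first matching suffix, keeping at least three characters. */
    private function stripFirstMatch(string $word, array $suffixes): string
    {
        foreach ($suffixes as $suffix) {
            $stemLength = mb_strlen($word) - mb_strlen($suffix);
            if ($stemLength >= 3 && str_ends_with($word, $suffix)) {
                return mb_substr($word, 0, $stemLength);
            }
        }

        return $word;
    }
}

$driver = new ToyGermanDriver();
echo $driver->stem('Zeitungen', ToyMode::LIGHT), PHP_EOL;      // zeitung
echo $driver->stem('Zeitungen', ToyMode::AGGRESSIVE), PHP_EOL; // zeit
```

A production driver would more likely follow an established algorithm such as the Snowball German stemmer than a hand-written suffix list.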

Gotchas and Tips

Pitfalls

  1. PHP 8.3 Requirement:

    • Fails on older PHP versions. Use Docker or upgrade PHP.
    • Tip: Test in a staging environment with PHP 8.3 before production deployment.
  2. Locale Sensitivity:

    • Incorrect locale codes (e.g., 'cs' vs. 'sk') may return unexpected results.
    • Tip: Validate incoming locale codes against a whitelist of the locales you registered (e.g., with in_array() or array_key_exists()) before calling stem().
  3. Aggressive Mode Over-Stemming:

    • May reduce words to unrecognizable roots (e.g., 'happiness' → 'happi').
    • Tip: Use LIGHT mode for user-facing text and AGGRESSIVE for indexing.
  4. Performance Overhead:

    • Stemming adds CPU cost. Benchmark with large datasets.
    • Tip: Cache frequent stems (e.g., Redis) if processing high volumes.
  5. No Fallback for Unsupported Locales:

    • Throws exceptions if the locale isn’t registered.
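    • Tip: Add a fallback, for example in your own wrapper around the Stemmer class rather than by editing vendor code (hasDriver() and $drivers below are illustrative, not confirmed package API):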
      public function stem(string $word, string $locale, StemmerMode $mode = StemmerMode::LIGHT): string
      {
          if (!$this->hasDriver($locale)) {
              return $word; // or log a warning
          }
          return $this->drivers[$locale]->stem($word, $mode);
      }
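
For the performance pitfall above, a process-local memo can be tried before reaching for Redis. This is a minimal sketch that swaps in a plain in-memory array for the cache; MemoizedStemmer is a hypothetical class, the mode is passed as a string for simplicity, and the closure stands in for a real stemmer.

```php
<?php
declare(strict_types=1);

/**
 * Wrap a stemming closure with a per-process, in-memory cache.
 * The key includes locale and mode because the same word can stem
 * differently under each combination.
 */
final class MemoizedStemmer
{
    /** @var array<string, string> */
    private array $cache = [];

    public function __construct(
        private readonly \Closure $inner,
        private readonly int $maxEntries = 10_000,
    ) {
    }

    public function stem(string $word, string $locale, string $mode): string
    {
        $key = $locale . '|' . $mode . '|' . $word;

        if (!isset($this->cache[$key])) {
            if (count($this->cache) >= $this->maxEntries) {
                $this->cache = []; // crude eviction: start over when full
            }
            $this->cache[$key] = ($this->inner)($word, $locale, $mode);
        }

        return $this->cache[$key];
    }
}

// Count how often the underlying stemmer is actually invoked.
$calls = 0;
$memo = new MemoizedStemmer(function (string $word, string $locale, string $mode) use (&$calls): string {
    $calls++;
    return rtrim($word, 's'); // toy stand-in for a real stemmer
});

$memo->stem('cats', 'en', 'light');
$memo->stem('cats', 'en', 'light');      // served from the cache
$memo->stem('cats', 'en', 'aggressive'); // different mode, new entry
echo $calls, PHP_EOL; // 2
```

The memo only lives for one request or worker process; a shared store such as Redis is worth the extra round trip only if benchmarks show stemming to be the bottleneck.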
      

Debugging Tips

  • Verify Output: Compare results with known stemming tools (e.g., Snowball).
  • Log Edge Cases: Log unexpected stems for review:
    if ($stemmed !== $expected) {
        \Log::warning("Stemming mismatch: {$word} -> {$stemmed} (expected: {$expected})");
    }
    
  • Test with Real Data: Use actual search queries or corpus text to validate accuracy.
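
The checks above can be gathered into a small table-driven harness. The sketch below assumes you maintain a list of word and expected-stem pairs; findStemMismatches() is a hypothetical helper and the closure is a toy stemmer, so replace it with a call into the real one.

```php
<?php
declare(strict_types=1);

/**
 * Run a stemmer over word => expected-stem pairs and return the mismatches.
 *
 * @param array<string, string> $expectations
 * @return list<array{word: string, expected: string, actual: string}>
 */
function findStemMismatches(callable $stem, array $expectations): array
{
    $mismatches = [];

    foreach ($expectations as $word => $expected) {
        $actual = $stem((string) $word);
        if ($actual !== $expected) {
            $mismatches[] = ['word' => (string) $word, 'expected' => $expected, 'actual' => $actual];
        }
    }

    return $mismatches;
}

// Toy stemmer for demonstration: strips a trailing "ing".
$toyStem = static fn (string $word): string => (string) preg_replace('/ing$/', '', $word);

$report = findStemMismatches($toyStem, [
    'working' => 'work',
    'running' => 'run', // the toy rule yields "runn", so this pair is reported
]);

foreach ($report as $row) {
    echo "{$row['word']}: expected {$row['expected']}, got {$row['actual']}", PHP_EOL;
}
// running: expected run, got runn
```

In a Laravel application each reported row could be passed to Log::warning() as in the snippet above, or the function could back a unit test.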

Configuration Quirks

  • Driver Registration Order:

    • Drivers are matched by locale code. If more than one driver is registered for the same locale, make sure the intended one is registered first.
    • Tip: Register drivers in a service provider’s boot() method for consistency.
  • Mode Selection:

    • LIGHT mode preserves readability (e.g., 'working' → 'work').
    • AGGRESSIVE mode maximizes recall (e.g., 'happiness' → 'happi').
    • Tip: Document the chosen mode in your codebase to avoid confusion.
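
One way to document the chosen mode is to keep it in a single constant that both the indexing and the query code read, since a term indexed in one mode may not match a query stemmed in the other. The sketch below illustrates this with stand-ins: DemoMode, toyStem(), and SearchStemming are hypothetical names, and the toy rules only mimic the examples above.

```php
<?php
declare(strict_types=1);

// Local stand-in for the package's mode enum.
enum DemoMode
{
    case LIGHT;
    case AGGRESSIVE;
}

// Toy rules: light strips "ing"; aggressive also strips "ness".
function toyStem(string $word, DemoMode $mode): string
{
    $pattern = $mode === DemoMode::AGGRESSIVE ? '/(ness|ing)$/' : '/ing$/';

    return (string) preg_replace($pattern, '', $word);
}

final class SearchStemming
{
    // Single source of truth for the mode used when indexing and when querying.
    public const MODE = DemoMode::AGGRESSIVE;
}

$indexedTerm = toyStem('happiness', SearchStemming::MODE); // "happi"

// Query stemmed with a different mode: the stems no longer line up.
$mixed = toyStem('happiness', DemoMode::LIGHT) === $indexedTerm;

// Query stemmed with the shared constant: the stems match.
$shared = toyStem('happiness', SearchStemming::MODE) === $indexedTerm;

var_dump($mixed, $shared); // bool(false), bool(true)
```

If different fields deliberately use different modes, the same idea applies per field: the query against a field should use that field's mode.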

Extension Points

  1. Add New Languages:

    • Implement DriverInterface and register the driver with the Stemmer instance.
  2. Custom Stemming Rules:

    • Override the stem() method in a custom driver for language-specific tweaks.
  3. Integration with Laravel Events:

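    • Trigger stemming on model events. Prefer saving over saved: calling update() inside a saved listener fires saved again and recurses.
      // app/Models/Post.php
      protected static function booted()
      {
          static::saving(function ($post) {
              // Set the attribute before the write so it is persisted in the same query.
              $post->stemmed_title = Stemmer::stem($post->title, 'en');
          });
      }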
      
  4. Middleware for API Requests:

    • Normalize incoming search queries:
      // app/Http/Middleware/StemSearchQueries.php
      public function handle(Request $request, Closure $next)
      {
          if ($request->has('q')) {
              $request->merge(['stemmed_q' => Stemmer::stem($request->q, 'en')]);
          }
          return $next($request);
      }
      