dompat/stemmer
Strictly typed PHP 8.3+ stemming library for better full-text search, indexing, and text analysis. Supports LIGHT and AGGRESSIVE modes, with extensible language drivers. Includes Czech and English out of the box; add custom locales easily.
Install via Composer:
composer require dompat/stemmer
First Use Case: Normalize search queries or index content for full-text search.
use Dompat\Stemmer\Stemmer;
use Dompat\Stemmer\Driver\EnglishDriver;
use Dompat\Stemmer\Enum\StemmerMode;
$stemmer = new Stemmer([new EnglishDriver('en')]);
echo $stemmer->stem('running', 'en', StemmerMode::AGGRESSIVE); // Output: "run"
Where to Look First:
Stemmer class for core functionality.
DriverInterface for extending to new languages.
StemmerMode enum for mode selection.
Pre-process searchable attributes before indexing:
// app/Models/Post.php
use Dompat\Stemmer\Enum\StemmerMode;
use Dompat\Stemmer\Facades\Stemmer;
public function toSearchableArray()
{
    return [
        'title' => Stemmer::stem($this->title, 'en', StemmerMode::AGGRESSIVE),
        'body' => Stemmer::stem($this->body, 'en', StemmerMode::LIGHT),
    ];
}
Normalize user input for search queries:
// app/Http/Controllers/SearchController.php
use Dompat\Stemmer\Facades\Stemmer;
public function search(Request $request)
{
    // $request->query is the query-string parameter bag, so read the field with input()
    $stemmedQuery = Stemmer::stem((string) $request->input('query', ''), 'en');
    return Post::where('title', 'like', "%{$stemmedQuery}%")->get();
}
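The stem() signature shown in this guide takes a single `$word`, so a multi-word query probably needs to be split and stemmed word by word. The following is a minimal sketch under that assumption; `stemQuery` and the toy `$toyStem` closure are hypothetical names for illustration, and in an application the callable would wrap the real stemmer (for example `fn (string $w): string => $stemmer->stem($w, 'en')`).

```php
<?php
declare(strict_types=1);

/**
 * Stem every word of a free-text query.
 *
 * $stemWord is any callable that maps one word to its stem.
 */
function stemQuery(string $query, callable $stemWord): string
{
    // Split on runs of whitespace and drop empty tokens.
    $words = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    return implode(' ', array_map($stemWord, $words));
}

// Toy stand-in for a real stemmer: lowercases and strips a trailing "ing" or "s".
$toyStem = fn (string $word): string => preg_replace('/(ing|s)$/', '', strtolower($word));

echo stemQuery('  Running   shoes ', $toyStem), PHP_EOL; // prints "runn shoe"
```

Passing the stemmer in as a callable keeps the splitting logic independent of the library, which also makes it easy to test.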
Centralize stemming logic in a service:
// app/Services/TextNormalizer.php
public function normalize(string $text, string $locale = 'en'): string
{
    return Stemmer::stem($text, $locale, StemmerMode::LIGHT);
}
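A centralized service is also a convenient place to cache results, since the same words tend to recur across documents and queries. The sketch below is a hypothetical wrapper, not part of dompat/stemmer: `CachingNormalizer` and its closure argument are assumptions made for illustration, and the toy closure only stands in for a real stemmer call.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical caching wrapper around a word-level stemming closure.
 * Results are remembered per locale and word for the lifetime of the object.
 */
final class CachingNormalizer
{
    /** @var array<string, string> */
    private array $cache = [];

    /** @param \Closure(string, string): string $stem receives (word, locale) */
    public function __construct(private readonly \Closure $stem)
    {
    }

    public function normalize(string $word, string $locale = 'en'): string
    {
        // Key on both locale and word so 'en' and 'cs' results never collide.
        $key = $locale . "\0" . $word;

        return $this->cache[$key] ??= ($this->stem)($word, $locale);
    }
}

$calls = 0;
$normalizer = new CachingNormalizer(function (string $word, string $locale) use (&$calls): string {
    $calls++;                  // count how often the underlying stemmer runs
    return rtrim($word, 's');  // toy stand-in for a real stemmer
});

$normalizer->normalize('cats');
$normalizer->normalize('cats');
echo $calls, PHP_EOL; // prints 1: the second call was served from the cache
```

An in-memory array like this only lives for one request; whether a shared cache is worthwhile depends on how expensive stemming turns out to be in practice.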
Create a facade to simplify usage:
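// app/Facades/Stemmer.php
public static function stem(string $word, string $locale, StemmerMode $mode = StemmerMode::LIGHT): string
{
    // Resolve the library class by its full name so it is not confused with this facade
    return app(\Dompat\Stemmer\Stemmer::class)->stem($word, $locale, $mode);
}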
Extend for unsupported languages (e.g., German):
// app/Driver/GermanDriver.php
class GermanDriver implements DriverInterface
{
    public function stem(string $word, StemmerMode $mode): string
    {
        // Implement German stemming logic here.
        // Placeholder: return the word unchanged so the declared string return type is satisfied.
        return $word;
    }
}
// Register in service provider
$stemmer->addDriver(new GermanDriver('de'));
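To make the shape of a driver more concrete, here is a toy suffix-stripping sketch. It runs on its own, so it declares local stand-ins (`Mode`, `WordDriver`) instead of the library's StemmerMode and DriverInterface, whose full definitions are not shown in this guide. The suffix lists are illustrative assumptions and nowhere near a complete German rule set.

```php
<?php
declare(strict_types=1);

// Local stand-ins so the sketch is self-contained; in a real project these
// would be the library's StemmerMode enum and DriverInterface.
enum Mode
{
    case LIGHT;
    case AGGRESSIVE;
}

interface WordDriver
{
    public function stem(string $word, Mode $mode): string;
}

final class ToyGermanDriver implements WordDriver
{
    // Illustrative suffixes only, ordered longest first within each list.
    private const LIGHT_SUFFIXES = ['en', 'er', 'e'];
    private const AGGRESSIVE_SUFFIXES = ['ungen', 'ung', 'heit', 'keit', 'en', 'er', 'e'];

    public function stem(string $word, Mode $mode): string
    {
        $word = mb_strtolower($word);
        $suffixes = $mode === Mode::AGGRESSIVE ? self::AGGRESSIVE_SUFFIXES : self::LIGHT_SUFFIXES;

        foreach ($suffixes as $suffix) {
            // Strip the first matching suffix, keeping at least three characters.
            if (str_ends_with($word, $suffix) && mb_strlen($word) - mb_strlen($suffix) >= 3) {
                return mb_substr($word, 0, -mb_strlen($suffix));
            }
        }

        return $word;
    }
}

$driver = new ToyGermanDriver();
echo $driver->stem('Zeitungen', Mode::LIGHT), PHP_EOL;      // prints "zeitung"
echo $driver->stem('Zeitungen', Mode::AGGRESSIVE), PHP_EOL; // prints "zeit"
```

The two modes differ only in which suffix list they use, which mirrors the LIGHT versus AGGRESSIVE trade-off described in this guide.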
PHP 8.3 Requirement:
Locale Sensitivity:
A mismatched locale (e.g., 'cs' vs. 'sk') may return unexpected results.
Validate locales with array_key_exists() or a whitelist.
Aggressive Mode Over-Stemming:
AGGRESSIVE mode can over-stem words (e.g., 'happiness' → 'happi').
Use LIGHT mode for user-facing text and AGGRESSIVE for indexing.
Performance Overhead:
No Fallback for Unsupported Locales:
A fallback can be added in the stem() method of the Stemmer class:
public function stem(string $word, string $locale, StemmerMode $mode = StemmerMode::LIGHT): string
{
    if (!$this->hasDriver($locale)) {
        return $word; // or log a warning
    }
    return $this->drivers[$locale]->stem($word, $mode);
}
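The whitelist check suggested under Locale Sensitivity can be sketched independently of the library. In the example below, `SUPPORTED_LOCALES` and `resolveLocale` are hypothetical names; the list contains only the two languages this guide says ship out of the box and should be adjusted to whichever drivers are actually registered.

```php
<?php
declare(strict_types=1);

// Hypothetical whitelist: locale code => human-readable name.
const SUPPORTED_LOCALES = [
    'cs' => 'Czech',
    'en' => 'English',
];

/**
 * Return the bare language code if it is whitelisted, otherwise the fallback.
 */
function resolveLocale(string $requested, string $fallback = 'en'): string
{
    // Reduce "en_US" or "en-GB" to "en" and normalize case.
    $code = strtolower(preg_split('/[_-]/', $requested, 2)[0]);

    return array_key_exists($code, SUPPORTED_LOCALES) ? $code : $fallback;
}

echo resolveLocale('cs'), PHP_EOL;    // prints "cs"
echo resolveLocale('sk'), PHP_EOL;    // prints "en" (Slovak is not whitelisted)
echo resolveLocale('en_US'), PHP_EOL; // prints "en"
```

Falling back silently is a design choice; rejecting unknown locales with an exception may be preferable when wrong-language stems would corrupt an index.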
if ($stemmed !== $expected) {
    \Log::warning("Stemming mismatch: {$word} -> {$stemmed} (expected: {$expected})");
}
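The mismatch check above can be extended to a small table of expectations, which is handy when tuning modes or writing a custom driver. This is a sketch with hypothetical names (`findStemMismatches`, `$toyStem`); the toy closure stands in for a call to the real stemmer.

```php
<?php
declare(strict_types=1);

/**
 * Compare a stemming callable against a table of expected stems.
 *
 * @param callable(string): string $stem
 * @param array<string, string> $expectations word => expected stem
 * @return array<string, array{expected: string, actual: string}>
 */
function findStemMismatches(callable $stem, array $expectations): array
{
    $mismatches = [];
    foreach ($expectations as $word => $expected) {
        $actual = $stem((string) $word);
        if ($actual !== $expected) {
            $mismatches[$word] = ['expected' => $expected, 'actual' => $actual];
        }
    }

    return $mismatches;
}

// Toy stemmer for the demo: strips a trailing "ing" and nothing else.
$toyStem = fn (string $word): string => preg_replace('/ing$/', '', $word);

$report = findStemMismatches($toyStem, [
    'working' => 'work',
    'running' => 'run', // the toy stemmer yields "runn", so this entry is reported
]);
print_r($report);
```

Each reported entry carries both values, so it can be passed straight to a logger or asserted to be empty in a test suite.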
Driver Registration Order:
Register drivers in a service provider's boot() method for consistency.
Mode Selection:
LIGHT mode preserves readability (e.g., 'working' → 'work').
AGGRESSIVE mode maximizes recall (e.g., 'happiness' → 'happi').
Add New Languages:
Implement DriverInterface and register the driver with the Stemmer instance.
Custom Stemming Rules:
Override the stem() method in a custom driver for language-specific tweaks.
Integration with Laravel Events:
Stem attributes in response to model events (e.g., saved):
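// app/Models/Post.php
protected static function booted()
{
    static::saved(function ($post) {
        // updateQuietly() skips model events; a plain update() here would fire saved again and loop
        $post->updateQuietly(['stemmed_title' => Stemmer::stem($post->title, 'en')]);
    });
}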
Middleware for API Requests:
// app/Http/Middleware/StemSearchQueries.php
public function handle(Request $request, Closure $next)
{
    if ($request->has('q')) {
        $request->merge(['stemmed_q' => Stemmer::stem($request->q, 'en')]);
    }
    return $next($request);
}