jfcherng/php-sequence-matcher
PHP 8.4+ longest sequence matcher inspired by Python difflib. Compare strings or arrays to find matching blocks and similarities for diffing, change detection, and text analysis. Lightweight library extracted from php-diff with improvements.
Install the package via Composer:
composer require jfcherng/php-sequence-matcher
First Use Case: Compare two strings or arrays to find similarity ratio (0-1):
use Jfcherng\SequenceMatcher\SequenceMatcher;
$matcher = new SequenceMatcher();
$ratio = $matcher->ratio("Laravel", "Laravul"); // Returns ~0.857
Key Methods to Explore:
ratio(): Returns similarity ratio (0 = completely different, 1 = identical).
getOpCodes(): Returns operation codes for detailed diff analysis.
findLongestMatch(): Finds the longest contiguous matching subsequence.
Where to Look First:
$matcher = new SequenceMatcher();
$similarity = $matcher->ratio($stringA, $stringB);
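To make the meaning of the ratio concrete, here is a minimal pure-PHP sketch of a difflib-style score: twice the number of matched characters divided by the combined length. It does not use the package; `countMatches()` and `similarityRatio()` are hypothetical helper names, the comparison is byte-wise, and whether the library computes `ratio()` exactly this way is an assumption.

```php
<?php
// Hypothetical helpers for illustration only; not part of the package.

/**
 * Count matched characters: take the longest common substring, then
 * recurse on the pieces to its left and to its right.
 */
function countMatches(string $a, string $b): int
{
    $lenA = strlen($a);
    $lenB = strlen($b);
    $best = 0;   // length of the longest common run found so far
    $bestA = 0;  // start of that run in $a
    $bestB = 0;  // start of that run in $b
    // $prev[$j] = length of the common run ending at $a[$i - 2] and $b[$j - 1]
    $prev = array_fill(0, $lenB + 1, 0);
    for ($i = 1; $i <= $lenA; $i++) {
        $curr = array_fill(0, $lenB + 1, 0);
        for ($j = 1; $j <= $lenB; $j++) {
            if ($a[$i - 1] === $b[$j - 1]) {
                $curr[$j] = $prev[$j - 1] + 1;
                if ($curr[$j] > $best) {
                    $best = $curr[$j];
                    $bestA = $i - $best;
                    $bestB = $j - $best;
                }
            }
        }
        $prev = $curr;
    }
    if ($best === 0) {
        return 0;
    }
    return $best
        + countMatches(substr($a, 0, $bestA), substr($b, 0, $bestB))
        + countMatches(substr($a, $bestA + $best), substr($b, $bestB + $best));
}

/** Ratio in [0, 1]: 2 * matches / (length of $a + length of $b). */
function similarityRatio(string $a, string $b): float
{
    $total = strlen($a) + strlen($b);
    return $total === 0 ? 1.0 : 2 * countMatches($a, $b) / $total;
}
```

Under this definition, "Laravel" and "Laravul" share 6 of their 14 combined characters twice over, giving 12 / 14, which agrees with the ~0.857 quoted earlier.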
Laravel Integration:
// In a service class
public function compareContent($oldContent, $newContent)
{
$matcher = app(SequenceMatcher::class);
return $matcher->ratio($oldContent, $newContent);
}
$opCodes = $matcher->getOpCodes($a, $b);
foreach ($opCodes as $op) {
// OP_EQ (1) = equal, OP_DEL (2) = delete, OP_INS (4) = insert, OP_REP (8) = replace
if ($op[0] & SequenceMatcher::OP_REP) {
// Handle replacements
}
}
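The `&` test in the loop works because each operation value is a distinct power of two. The sketch below illustrates that bit-flag pattern with local constants that mirror the values in the comment above; the `DEMO_OP_*` names, `describeOp()`, and `isChange()` are hypothetical and not part of the package.

```php
<?php
// Hypothetical constants mirroring the values listed above (1, 2, 4, 8).
const DEMO_OP_EQ = 1;
const DEMO_OP_DEL = 2;
const DEMO_OP_INS = 4;
const DEMO_OP_REP = 8;

/** Map a single operation tag to a readable label. */
function describeOp(int $tag): string
{
    return match (true) {
        ($tag & DEMO_OP_REP) !== 0 => 'replace',
        ($tag & DEMO_OP_INS) !== 0 => 'insert',
        ($tag & DEMO_OP_DEL) !== 0 => 'delete',
        ($tag & DEMO_OP_EQ) !== 0 => 'equal',
        default => 'nop',
    };
}

/** True when the tag is any kind of edit; one mask covers all three. */
function isChange(int $tag): bool
{
    return ($tag & (DEMO_OP_DEL | DEMO_OP_INS | DEMO_OP_REP)) !== 0;
}
```

Combining flags into one mask, as `isChange()` does, avoids a chain of separate comparisons when only "changed or not" matters.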
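// In a model observer
public function saving(Model $model)
{
    if ($model instanceof Post) {
        // saving() runs before the write, so getOriginal() still holds the stored value.
        // Setting the attribute here avoids calling save() again from inside a saved() hook,
        // which would re-trigger the observer.
        $model->similarity_score = app(SequenceMatcher::class)->ratio(
            (string) $model->getOriginal('content'),
            (string) $model->content
        );
    }
}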
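// In a form request (passedValidation() is a void hook, so reject by throwing)
protected function passedValidation()
{
    $expected = $this->getExpectedResponse();
    $actual = (string) $this->input('response'); // assumes the payload arrives in a "response" field
    $ratio = app(SequenceMatcher::class)->ratio($expected, $actual);
    if ($ratio < config('api.min_similarity')) {
        throw \Illuminate\Validation\ValidationException::withMessages([
            'response' => 'Response is not similar enough to the expected value.',
        ]);
    }
}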
$options = new \Jfcherng\SequenceMatcher\Options();
$options->setAutoJump(true); // Optimize for speed
$options->setThreshold(0.5); // Minimum ratio to consider a match
$matcher = new SequenceMatcher($options);
$isMatch = $matcher->ratio($a, $b) >= $options->getThreshold();
// Dispatch a job for large comparisons
CompareContentJob::dispatch($oldContent, $newContent);
// In the job
public function handle()
{
$matcher = app(SequenceMatcher::class);
$result = $matcher->ratio($this->oldContent, $this->newContent);
// Store or notify based on result
}
public function getCachedSimilarity($key, $a, $b)
{
return cache()->remember("sequence_{$key}", now()->addHours(1), function() use ($a, $b) {
return app(SequenceMatcher::class)->ratio($a, $b);
});
}
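The caching method above depends on the caller supplying a `$key` that changes whenever the inputs change. One option, sketched here as an assumption rather than a package feature, is to derive the key from the inputs themselves; `similarityCacheKey()` is a hypothetical helper.

```php
<?php
/**
 * Build a cache key from both inputs (hypothetical helper).
 * The length prefix keeps ("ab", "c") and ("a", "bc") from colliding,
 * since plain concatenation would produce "abc" for both pairs.
 */
function similarityCacheKey(string $a, string $b): string
{
    return 'sequence_' . hash('sha256', strlen($a) . ':' . $a . $b);
}
```

The result could be passed to `cache()->remember()` in place of `"sequence_{$key}"`, so identical comparisons share one cache entry.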
public function testSimilarityCalculation()
{
$matcher = new SequenceMatcher();
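$this->assertEqualsWithDelta(1.0, $matcher->ratio("hello", "hello"), 1e-9);
// "hello" and "world" share characters, so use fully disjoint inputs for 0.0
$this->assertEqualsWithDelta(0.0, $matcher->ratio("abc", "xyz"), 1e-9);
// difflib-style ratio is 2 * matches / total length: 2 * 3 / 7
$this->assertEqualsWithDelta(6 / 7, $matcher->ratio("abc", "abcd"), 1e-9);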
}
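Because ratios are floats, exact equality checks can fail on rounding alone. A minimal tolerance comparison, with the hypothetical helper name `ratiosEqual()` and an arbitrarily chosen default epsilon, might look like this:

```php
<?php
/**
 * Compare two ratios within a tolerance (hypothetical helper).
 * The default epsilon is an illustrative choice, not a library setting.
 */
function ratiosEqual(float $x, float $y, float $epsilon = 1e-9): bool
{
    return abs($x - $y) <= $epsilon;
}

// Classic rounding case: the sum is not exactly 0.3 in binary floating point.
var_dump((0.1 + 0.2) === 0.3);          // bool(false)
var_dump(ratiosEqual(0.1 + 0.2, 0.3));  // bool(true)
```

The same idea applies to threshold checks: comparing against a threshold with a small tolerance avoids flapping on values that differ only in the last few digits.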
Memory Usage with Large Sequences
memory_get_usage() and implement fallbacks for oversized inputs.
Floating-Point Precision
0.9999999999999999 may appear due to floating-point arithmetic.
number_format($ratio, 4).
Case Sensitivity
$matcher->ratio(strtolower($a), strtolower($b));
Unicode Handling
mb_strtolower() for Unicode-aware case folding.
Performance with Arrays
json_encode()). Deeply nested structures may bloat memory.
Breaking Changes in v5.0.0+
// Old (pre-5.0.0)
$matcher->ratio($a, $b, SequenceMatcher::OP_NOP);
// New
$options = new Options();
$matcher = new SequenceMatcher($options);
Inspect Operation Codes
getOpCodes() to debug mismatches:
$opCodes = $matcher->getOpCodes($a, $b);
print_r($opCodes); // Analyze deletions/insertions/replacements
Compare with Python
difflib:
from difflib import SequenceMatcher
print(SequenceMatcher(None, "abc", "abcd").ratio())  # 0.8571428571428571 (6/7); should match PHP output
Profile Performance
Illuminate\Support\Benchmark or Xdebug to identify slow comparisons:
$ms = \Illuminate\Support\Benchmark::measure(fn () => $matcher->ratio($largeA, $largeB)); // elapsed milliseconds
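Outside Laravel, a small wrapper around PHP's built-in `hrtime()` can serve the same purpose. This is a framework-free sketch; `timeMs()` is a hypothetical helper name, and a single run is only a rough measurement.

```php
<?php
/**
 * Run a callable once and return the elapsed wall-clock time in milliseconds.
 * hrtime(true) returns a monotonic timestamp in nanoseconds.
 */
function timeMs(callable $fn): float
{
    $start = hrtime(true);
    $fn();
    return (hrtime(true) - $start) / 1e6;
}

// Usage: time any comparison by wrapping it in a closure.
$elapsed = timeMs(fn () => levenshtein('kitten', 'sitting'));
printf("took %.3f ms\n", $elapsed);
```

Here `levenshtein()` stands in for the comparison being profiled; for stable numbers, repeat the call several times and look at the median.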
Handle Edge Cases
["a", 1] vs. ["a", "1"]).
Custom Matcher Logic
SequenceMatcher to add domain-specific rules:
class CustomMatcher extends SequenceMatcher
{
public function ratio($a, $b)
{
// Pre-process inputs (e.g., strip HTML tags)
$a = strip_tags($a);
$b = strip_tags($b);
return parent::ratio($a, $b);
}
}
Laravel Service Provider
$this->app->singleton(SequenceMatcher::class, function() {
$options = new Options();
$options->setAutoJump(true);
return new SequenceMatcher($options);
});
Event-Based Notifications
event(new SequenceMismatch($model, $similarity));
Queueable Comparisons
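class CompareContentJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable;

    // Inputs captured at dispatch time: CompareContentJob::dispatch($oldContent, $newContent)
    public function __construct(
        private string $oldContent,
        private string $newContent,
    ) {}

    public function handle()
    {
        $matcher = app(SequenceMatcher::class);
        $result = $matcher->ratio($this->oldContent, $this->newContent);
        // Store or notify
    }
}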
Options Object
Options class (not scalar values) in PHP 8.3+:
// Correct
$options = new Options();
$options->setAutoJump(true);
$matcher = new SequenceMatcher($options);
// Incorrect (pre-5.0.0)
$matcher->ratio($a, $b, true);
Threshold Handling
threshold option in Options is not used by ratio(). It’s for internal checks in findLongestMatch().