Stop Words Laravel Package

yooper/stop-words

Collection of stop-word lists gathered from public sources for filtering text in multiple languages. Includes one-word-per-line files and welcomes contributions and validation ideas to improve stop-word quality.

Getting Started

Install via Composer:

composer require yooper/stop-words

Begin by instantiating the StopWords service and using default English stop words:

use Yooper\StopWords\StopWords;

$stopWords = new StopWords();
$clean = $stopWords->filter('The quick brown fox jumps over the lazy dog');
// Result depends on the loaded list; "over" is in many English lists, so expect something like 'quick brown fox jumps lazy dog'

First use case: clean user-submitted search queries before passing them to your search engine (e.g., Elasticsearch) to reduce noise and improve relevance.
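
The search-query use case can also be sketched without relying on any particular class API. The package description says the lists ship as one-word-per-line files, so this minimal sketch parses that format directly. `parseStopWords` and `cleanQuery` are hypothetical helper names, not part of the package, and the inline word list is a stand-in for a real list file read with file_get_contents().

```php
<?php

/**
 * Parse one-word-per-line text into a lookup set (word => true).
 * Hypothetical helper for illustration; not part of the package.
 */
function parseStopWords(string $contents): array
{
    $set = [];
    // \R matches any line ending (\n, \r\n, \r).
    foreach (preg_split('/\R/', $contents) as $line) {
        $word = strtolower(trim($line));
        if ($word !== '') {
            $set[$word] = true;
        }
    }
    return $set;
}

/**
 * Remove stop words from a query. Matching is case-insensitive for ASCII only.
 */
function cleanQuery(string $query, array $stopWords): string
{
    $tokens = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);
    $kept = array_filter($tokens, fn (string $t): bool => !isset($stopWords[strtolower($t)]));
    return implode(' ', $kept);
}

// Stand-in for the contents of a list file (assumed format: one word per line).
$stopWords = parseStopWords("the\nover\na\nan\n");
echo cleanQuery('The quick brown fox jumps over the lazy dog', $stopWords), PHP_EOL;
// prints "quick brown fox jumps lazy dog"
```

Storing the words as array keys makes each lookup a hash check instead of a linear in_array() scan, which matters once the list holds several hundred entries.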

Implementation Patterns

  • Laravel Integration: Bind StopWords in a service provider or use it directly in jobs, services, or models (e.g., App\Services\TextCleaner).
  • Custom Lists: Load a language-specific list (en, fr, es, etc.), extend the active list with addStopWords(array $words), or replace it entirely with setStopWords(array $words).
  • Token Array Workflow: When tokenizing text manually (e.g., for keyword extraction), use isStopWord(string $word) to filter tokens conditionally:
    $tokens = preg_split('/\s+/', 'Data science is an interdisciplinary field');
    $filtered = array_filter($tokens, fn($w) => !$stopWords->isStopWord($w));
    
  • Middleware: Apply to incoming request text fields in API or form validation pipelines to normalize user input before storage/indexing.
  • Opt-in Strategy: Only filter once the text's language is known (e.g., after calling setLanguage('en')). Matching is case-insensitive by default, so “The” is removed along with “the”; use setStrict(true) only when you need case-sensitive matching.
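
The patterns above can be sketched as a small framework-free service class. This is a standalone illustration under stated assumptions, not the package's own class: `TextCleaner` borrows its name from the App\Services\TextCleaner example, `filterTokens` is a hypothetical method, and the other method names only mirror the ones mentioned in the list.

```php
<?php

/**
 * Standalone illustration of a stop-word service; not the package's class.
 */
final class TextCleaner
{
    /** @var array<string, true> lookup set keyed by lowercased word */
    private array $stopWords = [];

    public function __construct(array $words = [])
    {
        $this->setStopWords($words);
    }

    /** Replace the whole list. */
    public function setStopWords(array $words): void
    {
        $this->stopWords = [];
        $this->addStopWords($words);
    }

    /** Extend the current list without discarding existing entries. */
    public function addStopWords(array $words): void
    {
        foreach ($words as $word) {
            $this->stopWords[strtolower($word)] = true;
        }
    }

    /** Case-insensitive check (ASCII folding only). */
    public function isStopWord(string $word): bool
    {
        return isset($this->stopWords[strtolower($word)]);
    }

    /** Return the tokens with stop words removed, reindexed from 0. */
    public function filterTokens(array $tokens): array
    {
        return array_values(array_filter($tokens, fn (string $t): bool => !$this->isStopWord($t)));
    }
}

$cleaner = new TextCleaner(['is', 'an']);
$cleaner->addStopWords(['field']); // domain-specific addition
$tokens = preg_split('/\s+/', 'Data science is an interdisciplinary field');
print_r($cleaner->filterTokens($tokens)); // Data, science, interdisciplinary
```

array_filter() preserves the original keys, so the sketch wraps the result in array_values() to get a contiguous list. In a Laravel application, a class like this could be registered once in a service provider (for example with $this->app->singleton()) and injected wherever text is cleaned.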

Gotchas and Tips

  • ⚠️ Stale Language List: The last release was in 2017—stop-word sets may be outdated or incomplete for modern web content (e.g., missing slang, hashtags, emojis). Always test against real-world data.
  • 🔍 Case Sensitivity: By default, matching is case-insensitive (isStopWord('THE') returns true). Use setStrict(true) if you need case-sensitive control.
  • 🧱 Whitespace Handling: filter() collapses runs of spaces, so the original spacing is not preserved. If you need a predictable single-space result regardless of input, normalize afterwards with preg_replace('/\s+/', ' ', trim($result)); if the exact original spacing matters, filter tokens yourself with isStopWord() instead.
  • 🛠 Extensibility: Replace the built-in list by calling setStopWords() with your own array, or keep it and append domain-specific terms with addStopWords() (e.g., treating “CPU” and “RAM” as stop words in tech product descriptions).
  • 🚫 Unicode Handling Unverified: The package likely assumes ASCII. Handle multilingual text with care and test accented characters (e.g., “café”, “naïve”) to confirm they are matched as expected.
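
The case-sensitivity, whitespace, and Unicode points above can be checked with a small standalone sketch. It does not use the package; `normalizeToken` and `filterUtf8` are hypothetical helpers, and the code assumes UTF-8 input and that PHP's mbstring extension is installed.

```php
<?php

/**
 * Lowercase with multibyte awareness so "CAFÉ" and "café" compare equal.
 * strtolower() only reliably folds ASCII letters, so it would miss the "É".
 */
function normalizeToken(string $token): string
{
    return mb_strtolower($token, 'UTF-8');
}

/**
 * Remove stop words from UTF-8 text and join the rest with single spaces.
 */
function filterUtf8(string $text, array $stopWords): string
{
    $set = array_fill_keys(array_map('normalizeToken', $stopWords), true);
    // The /u modifier makes PCRE treat the subject as UTF-8.
    $tokens = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $kept = array_filter($tokens, fn (string $t): bool => !isset($set[normalizeToken($t)]));
    return implode(' ', $kept);
}

// Mixed spacing (double spaces, a tab) and accented, upper-case input.
echo filterUtf8("Le  CAFÉ   est très\tnaïve", ['le', 'est', 'très']), PHP_EOL;
// prints "CAFÉ naïve"
```

Running a few such inputs through the package's own filter and comparing the results is a quick way to see whether its matching behaves the same way on accented text.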