droath/laravel-text-chunker
Flexible Laravel text chunking for AI/LLM apps. Split content into smaller chunks by characters, tokens, sentences, or markdown-aware rules. Fluent, strategy-based API ideal for fitting token limits, RAG pipelines, and custom domain splitting.
The package uses a strategy pattern behind a fluent API, which fits current Laravel conventions. It registers through package auto-discovery and a service container binding, and its dependencies are minimal (tiktoken, spatie/laravel-package-tools). Immutable chunk value objects and deferred validation improve the developer experience. However, with 0 stars, 0 dependents, and only an initial v1.0.0 release, there is little evidence of real-world adoption. Key risks include untested edge cases in the markdown and sentence strategies (e.g., complex nested markdown), potentially inaccurate token counts for non-OpenAI models (tiktoken implements OpenAI's encodings, and other models tokenize differently), and a PHP 8.3+ requirement that rules out older environments. Critical questions: How does it handle non-UTF-8 text or multilingual sentence boundaries? What is the performance impact when processing 100MB+ of text? Are there known issues with preserving markdown code blocks when chunks overlap?
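To make the overlap and UTF-8 questions concrete, here is a minimal sketch of character-based chunking with overlap in plain PHP. It does not use or reproduce the package's API: the function name `chunkByCharacters` and its parameters are hypothetical, chosen only for illustration. It assumes UTF-8 input and the mbstring extension.

```php
<?php

declare(strict_types=1);

/**
 * Split $text into chunks of at most $size characters, where each chunk
 * starts with the last $overlap characters of the previous chunk.
 *
 * Hypothetical helper for illustration only; it is NOT part of
 * droath/laravel-text-chunker.
 *
 * @return list<string>
 */
function chunkByCharacters(string $text, int $size, int $overlap = 0): array
{
    if ($size < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1.');
    }
    if ($overlap < 0 || $overlap >= $size) {
        throw new InvalidArgumentException('Overlap must be in the range [0, size).');
    }

    // Count characters, not bytes, so multibyte characters are never cut in half.
    $length = mb_strlen($text, 'UTF-8');
    $step = $size - $overlap; // how far the window advances each iteration
    $chunks = [];

    for ($start = 0; $start < $length; $start += $step) {
        $chunks[] = mb_substr($text, $start, $size, 'UTF-8');

        // Stop once the current window has reached the end of the text,
        // so no trailing chunk consists solely of already-seen overlap.
        if ($start + $size >= $length) {
            break;
        }
    }

    return $chunks;
}

// Example: 10 characters, windows of 4 with 1 character of overlap.
print_r(chunkByCharacters('abcdefghij', 4, 1));
```

Because the cut points here ignore document structure, a window can start or end inside a fenced code block or in the middle of a sentence, which is the failure mode the markdown and sentence strategies would need to avoid. The sketch is also not tuned for very large inputs; character-offset lookups in UTF-8 strings are likely to get slower as the offset grows, so a 100MB+ workload would need a different approach.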