Laravel Text Chunker Laravel Package

droath/laravel-text-chunker

Flexible Laravel text chunking for AI/LLM apps. Split content into smaller chunks by characters, tokens, sentences, or markdown-aware rules. Fluent, strategy-based API ideal for fitting token limits, RAG pipelines, and custom domain splitting.

v1.0.0

Initial release of Laravel Text Chunker - a flexible, strategy-based text chunking package for Laravel applications.

Features

Core Architecture

  • Strategy Pattern: Implemented flexible strategy-based architecture for text chunking
  • Fluent API: Chainable method calls for intuitive usage (strategy()->size()->overlap()->chunk())
  • Immutable Chunks: Readonly value objects with text, index, and position metadata
  • Lazy Validation: Options set through the fluent API are validated when chunk() executes, not on each setter call
  • Laravel Integration: Service provider, facade, and auto-discovery support
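
The strategy pattern, fluent chaining, and lazy validation described above can be sketched in plain PHP. This is only an illustration of the design: every name here (SketchStrategy, SketchWordStrategy, SketchChunker, register()) is hypothetical and is not taken from the package, whose real interface, manager, and facade may look different.

```php
<?php

declare(strict_types=1);

// Hypothetical contract; the package's real interface may differ.
interface SketchStrategy
{
    /** @return list<string> */
    public function chunk(string $text, int $size): array;
}

// Hypothetical custom strategy: groups of $size whitespace-separated words.
final class SketchWordStrategy implements SketchStrategy
{
    public function chunk(string $text, int $size): array
    {
        $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];

        return array_map(
            static fn (array $group): string => implode(' ', $group),
            array_chunk($words, $size),
        );
    }
}

// Hypothetical manager with chainable setters.
final class SketchChunker
{
    /** @var array<string, SketchStrategy> */
    private array $strategies = [];
    private ?string $strategy = null;
    private ?int $size = null;

    public function register(string $name, SketchStrategy $strategy): static
    {
        $this->strategies[$name] = $strategy;

        return $this;
    }

    // Setters only record values; nothing is validated yet.
    public function strategy(string $name): static
    {
        $this->strategy = $name;

        return $this;
    }

    public function size(int $size): static
    {
        $this->size = $size;

        return $this;
    }

    /** @return list<string> */
    public function chunk(string $text): array
    {
        // Lazy validation: all checks run here, at execution time.
        if ($this->strategy === null || ! isset($this->strategies[$this->strategy])) {
            throw new InvalidArgumentException(sprintf(
                'Unknown strategy "%s". Available strategies: %s',
                $this->strategy ?? '',
                implode(', ', array_keys($this->strategies)),
            ));
        }
        if ($this->size === null || $this->size < 1) {
            throw new InvalidArgumentException('Chunk size must be a positive integer.');
        }

        return $this->strategies[$this->strategy]->chunk($text, $this->size);
    }
}
```

With this sketch, (new SketchChunker())->register('words', new SketchWordStrategy())->strategy('words')->size(2)->chunk('one two three four five') yields three chunks, and selecting an unregistered strategy fails only when chunk() is called, with a message that lists the registered names.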

Built-in Strategies

  • Character Strategy: Split text at exact character count boundaries with multibyte UTF-8 support
  • Token Strategy: Split text by OpenAI token count using the tiktoken library, so chunks can be sized to a model's token limit
  • Sentence Strategy: Split text at sentence boundaries with configurable abbreviation handling
  • Markdown Strategy: Preserve markdown structure (code blocks, headers, lists, blockquotes, horizontal rules) while chunking
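
To make the idea of sentence splitting with abbreviation handling concrete, here is a rough standalone sketch. It is not the package's algorithm: the function name sketchSplitSentences and the approach (split after ., !, or ?, then re-join pieces that end in a known abbreviation) are assumptions chosen for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: split text into sentences, treating the listed
 * abbreviations (written with their trailing period) as non-terminal.
 *
 * @param list<string> $abbreviations
 * @return list<string>
 */
function sketchSplitSentences(
    string $text,
    array $abbreviations = ['Dr.', 'Mr.', 'Mrs.', 'Ms.'],
): array {
    // Candidate boundaries: whitespace that follows '.', '!' or '?'.
    $parts = preg_split('/(?<=[.!?])\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    if ($parts === false) {
        throw new RuntimeException('Failed to split text.');
    }

    $sentences = [];
    $buffer = '';

    foreach ($parts as $part) {
        // Pieces are re-joined with a single space, which normalizes whitespace.
        $buffer = $buffer === '' ? $part : $buffer . ' ' . $part;

        // If the last word is a known abbreviation, the period did not end
        // the sentence, so keep accumulating.
        $lastWord = preg_match('/(\S+)$/u', $buffer, $m) === 1 ? $m[1] : '';
        if (in_array($lastWord, $abbreviations, true)) {
            continue;
        }

        $sentences[] = $buffer;
        $buffer = '';
    }

    if ($buffer !== '') {
        $sentences[] = $buffer;
    }

    return $sentences;
}
```

For the input "Dr. Smith arrived. He met Mrs. Jones! Was it late?" this sketch returns three sentences rather than five. One known limitation of the approach: a sentence that genuinely ends with an abbreviation is merged with the sentence that follows it.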

Advanced Capabilities

  • Overlap Support: Percentage-based overlap (0-100%) for context preservation across chunks
  • Custom Strategies: Easy registration of custom chunking strategies via interface implementation
  • Position Tracking: Accurate character position tracking (start_position, end_position) for all chunks
  • Configurable Options: Strategy-specific options (token model selection, custom abbreviations, etc.)
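
Percentage-based overlap and position tracking are easier to reason about with a small worked example. The sketch below is independent of the package: SketchChunk and sketchChunkByCharacters are hypothetical names, the overlap is rounded down to whole characters, and end_position is assumed to be exclusive, none of which is confirmed by the release notes. It requires the mbstring extension.

```php
<?php

declare(strict_types=1);

// Hypothetical immutable value object; the package's chunk class may differ.
final readonly class SketchChunk
{
    public function __construct(
        public string $text,
        public int $index,
        public int $start_position,
        public int $end_position, // assumed exclusive
    ) {}
}

/**
 * Split $text into chunks of at most $size characters, each one repeating
 * the last $overlapPercent percent of the previous chunk.
 *
 * @return list<SketchChunk>
 */
function sketchChunkByCharacters(string $text, int $size, int $overlapPercent = 0): array
{
    if ($size < 1) {
        throw new InvalidArgumentException('Size must be at least 1.');
    }
    if ($overlapPercent < 0 || $overlapPercent > 100) {
        throw new InvalidArgumentException('Overlap must be between 0 and 100.');
    }

    $overlap = intdiv($size * $overlapPercent, 100);
    // Advance by at least one character so that 100% overlap still terminates.
    $step = max(1, $size - $overlap);
    $length = mb_strlen($text, 'UTF-8');

    $chunks = [];
    for ($start = 0, $index = 0; $start < $length; $start += $step, $index++) {
        // mb_substr counts characters, not bytes, so multibyte text is safe.
        $piece = mb_substr($text, $start, $size, 'UTF-8');
        $end = $start + mb_strlen($piece, 'UTF-8');
        $chunks[] = new SketchChunk($piece, $index, $start, $end);

        if ($end >= $length) {
            break; // the final chunk reached the end of the text
        }
    }

    return $chunks;
}
```

Calling sketchChunkByCharacters('héllo wörld', 5, 20) gives an overlap of one character and a step of four, producing the chunks "héllo" (0 to 5), "o wör" (4 to 9), and "rld" (8 to 11). Because the chunk object is readonly, assigning to any of its properties after construction raises an Error.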

Developer Experience

  • Comprehensive Documentation: Full PHPDoc coverage for all public APIs
  • Descriptive Exceptions: Clear, actionable error messages with available options listed
  • Type Safety: Strict types throughout with PHP 8.3+ modern syntax
  • Extensive Testing: 103 tests with 522 assertions covering all functionality
  • Code Quality: PSR-12 compliant via Laravel Pint, PHPStan level 5 static analysis

Configuration

  • Publishable configuration file for default strategy and strategy-specific settings
  • Auto-registration of custom strategies from config
  • Token model configuration (gpt-4, gpt-3.5-turbo, etc.)
  • Sentence abbreviations configuration (Dr., Mr., Mrs., Ms., etc.)

Requirements

  • PHP 8.3 or higher
  • Laravel 11.x or 12.x

Dependencies

  • yethee/tiktoken ^0.12.0 - OpenAI token encoding/decoding
  • spatie/laravel-package-tools ^1.16 - Package bootstrapping utilities

Testing

  • Complete test coverage with Pest 4.x framework
  • Unit tests for all strategies, manager, and core components
  • Feature tests for end-to-end workflows and integration
  • Validation tests for all error conditions
  • Strategic gap tests for edge cases and real-world scenarios

Notes

  • This is the MVP release focused on core chunking functionality
  • All public APIs are considered stable
  • Future releases will maintain backward compatibility