Laravel Text Chunker Laravel Package

droath/laravel-text-chunker

Flexible Laravel text chunking for AI/LLM apps. Split content into smaller chunks by characters, tokens, sentences, or markdown-aware rules. Fluent, strategy-based API ideal for fitting token limits, RAG pipelines, and custom domain splitting.

v1.0.0

Initial release of Laravel Text Chunker - a flexible, strategy-based text chunking package for Laravel applications.

Features

Core Architecture

  • Strategy Pattern: Implemented flexible strategy-based architecture for text chunking
  • Fluent API: Chainable method calls for intuitive usage (strategy()->size()->overlap()->chunk())
  • Immutable Chunks: Readonly value objects with text, index, and position metadata
  • Lazy Validation: Validation deferred to execution time for better developer experience
  • Laravel Integration: Service provider, facade, and auto-discovery support
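
To make the fluent chain, immutable chunks, and lazy validation concrete, here is a minimal, framework-free sketch. It is not the package's source: `SketchChunk`, `SketchChunker`, the `'character'` strategy name, and the rule that overlap is a percentage of the chunk size are all assumptions chosen for illustration. Only the method chain `strategy()->size()->overlap()->chunk()` comes from the release notes.

```php
<?php

declare(strict_types=1);

// Hypothetical value object: readonly, so a chunk cannot change after creation.
final readonly class SketchChunk
{
    public function __construct(
        public string $text,
        public int $index,
        public int $startPosition,
        public int $endPosition,
    ) {
    }
}

// Hypothetical fluent builder; the real package's classes may differ.
final class SketchChunker
{
    private string $strategy = '';
    private int $size = 0;
    private int $overlap = 0;

    // Setters only record values; nothing is validated until chunk() runs.
    public function strategy(string $name): static
    {
        $this->strategy = $name;
        return $this;
    }

    public function size(int $size): static
    {
        $this->size = $size;
        return $this;
    }

    public function overlap(int $percent): static
    {
        $this->overlap = $percent;
        return $this;
    }

    /** @return list<SketchChunk> */
    public function chunk(string $text): array
    {
        // Lazy validation: the whole configuration is checked at execution time.
        if ($this->strategy !== 'character') {
            throw new InvalidArgumentException(
                "Unknown strategy '{$this->strategy}'. Available: character"
            );
        }
        if ($this->size < 1) {
            throw new InvalidArgumentException('Chunk size must be at least 1.');
        }
        if ($this->overlap < 0 || $this->overlap > 100) {
            throw new InvalidArgumentException('Overlap must be between 0 and 100.');
        }

        // Assumption: overlap is a percentage of the chunk size, and the window
        // always advances by at least one character so the loop terminates.
        $step = max(1, $this->size - intdiv($this->size * $this->overlap, 100));
        $length = mb_strlen($text, 'UTF-8');
        $chunks = [];

        for ($start = 0; $start < $length; $start += $step) {
            // mb_substr counts characters, not bytes, so multibyte text is safe.
            $piece = mb_substr($text, $start, $this->size, 'UTF-8');
            $end = $start + mb_strlen($piece, 'UTF-8');
            $chunks[] = new SketchChunk($piece, count($chunks), $start, $end);

            if ($end >= $length) {
                break; // The last chunk already reaches the end of the text.
            }
        }

        return $chunks;
    }
}
```

Under these assumptions, `(new SketchChunker())->strategy('character')->size(4)->overlap(50)->chunk('abcdefgh')` yields three chunks (`abcd`, `cdef`, `efgh`), and an invalid strategy name is only reported when `chunk()` is called, not when `strategy()` is.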

Built-in Strategies

  • Character Strategy: Split text at exact character count boundaries with multibyte UTF-8 support
  • Token Strategy: Split text by OpenAI token count using the tiktoken library, so chunks can be sized against model token limits
  • Sentence Strategy: Split text at sentence boundaries with configurable abbreviation handling
  • Markdown Strategy: Preserve markdown structure (code blocks, headers, lists, blockquotes, horizontal rules) while chunking
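
As an illustration of what abbreviation-aware sentence splitting involves, the sketch below splits on sentence-ending punctuation and then re-joins any piece that ends in a known abbreviation. The function name `sketchSplitSentences` and the merge approach are assumptions for this example; the package's own Sentence Strategy may work differently.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: split text into sentences, treating the given
 * abbreviations as non-terminal even though they end with a period.
 *
 * @param list<string> $abbreviations
 * @return list<string>
 */
function sketchSplitSentences(
    string $text,
    array $abbreviations = ['Dr.', 'Mr.', 'Mrs.', 'Ms.'],
): array {
    // Split on whitespace that follows ., ! or ?; the punctuation stays attached.
    $parts = preg_split('/(?<=[.!?])\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $sentences = [];
    $buffer = '';

    foreach ($parts as $part) {
        // Re-join with a single space (original whitespace is not preserved).
        $buffer = $buffer === '' ? $part : $buffer . ' ' . $part;

        // If the buffer ends with an abbreviation, the sentence is not finished.
        $words = preg_split('/\s+/u', $buffer) ?: [];
        $lastWord = end($words);
        if (in_array($lastWord, $abbreviations, true)) {
            continue;
        }

        $sentences[] = $buffer;
        $buffer = '';
    }

    // Keep any trailing text that ended on an abbreviation.
    if ($buffer !== '') {
        $sentences[] = $buffer;
    }

    return $sentences;
}
```

With the default list, `sketchSplitSentences('Dr. Smith arrived. He met Mrs. Jones! Did it go well?')` returns three sentences rather than five, because `Dr.` and `Mrs.` do not end a sentence.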

Advanced Capabilities

  • Overlap Support: Percentage-based overlap (0-100%) for context preservation across chunks
  • Custom Strategies: Easy registration of custom chunking strategies via interface implementation
  • Position Tracking: Accurate character position tracking (start_position, end_position) for all chunks
  • Configurable Options: Strategy-specific options (token model selection, custom abbreviations, etc.)
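
Registering a custom strategy through an interface could look roughly like the sketch below. The interface `SketchStrategy`, the `SketchStrategyRegistry` class, and its `extend()` and `get()` methods are hypothetical names invented for this example; consult the package's documentation for the actual contract and registration call.

```php
<?php

declare(strict_types=1);

// Hypothetical contract that every strategy implements.
interface SketchStrategy
{
    /** @return list<string> */
    public function split(string $text): array;
}

// Example custom strategy: one chunk per paragraph (blank-line separated).
final class SketchParagraphStrategy implements SketchStrategy
{
    public function split(string $text): array
    {
        // \R matches any line break; two or more in a row end a paragraph.
        $parts = preg_split('/\R{2,}/u', trim($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];

        return array_values(array_map('trim', $parts));
    }
}

// Hypothetical registry that maps strategy names to implementations.
final class SketchStrategyRegistry
{
    /** @var array<string, SketchStrategy> */
    private array $strategies = [];

    public function extend(string $name, SketchStrategy $strategy): static
    {
        $this->strategies[$name] = $strategy;
        return $this;
    }

    public function get(string $name): SketchStrategy
    {
        if (!isset($this->strategies[$name])) {
            // List the registered names so the error message is actionable.
            $available = implode(', ', array_keys($this->strategies)) ?: 'none';
            throw new InvalidArgumentException(
                "Unknown strategy '{$name}'. Available: {$available}"
            );
        }

        return $this->strategies[$name];
    }
}
```

A caller would register the strategy once, for example `$registry->extend('paragraph', new SketchParagraphStrategy())`, and then resolve it by name. Requesting an unregistered name fails with a message that lists the registered ones.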

Developer Experience

  • Comprehensive Documentation: Full PHPDoc coverage for all public APIs
  • Descriptive Exceptions: Clear, actionable error messages with available options listed
  • Type Safety: Strict types throughout, written in modern PHP 8.3+ syntax
  • Extensive Testing: 103 tests with 522 assertions covering all functionality
  • Code Quality: PSR-12 compliant via Laravel Pint, PHPStan level 5 static analysis

Configuration

  • Publishable configuration file for default strategy and strategy-specific settings
  • Auto-registration of custom strategies from config
  • Token model configuration (gpt-4, gpt-3.5-turbo, etc.)
  • Sentence abbreviations configuration (Dr., Mr., Mrs., Ms., etc.)

Requirements

  • PHP 8.3 or higher
  • Laravel 11.x or 12.x

Dependencies

  • yethee/tiktoken ^0.12.0 - OpenAI token encoding/decoding
  • spatie/laravel-package-tools ^1.16 - Package bootstrapping utilities

Testing

  • Complete test coverage with the Pest 4.x framework
  • Unit tests for all strategies, manager, and core components
  • Feature tests for end-to-end workflows and integration
  • Validation tests for all error conditions
  • Strategic gap tests for edge cases and real-world scenarios

Notes

  • This is the MVP release focused on core chunking functionality
  • All public APIs are considered stable
  • Future releases will maintain backward compatibility