yethee/tiktoken
PHP port of OpenAI tiktoken for fast tokenization. Get encoders by model or encoding name, encode text to token IDs, with default vocab caching and configurable cache dir. Optional experimental FFI lib mode (tiktoken-rs) for better performance on larger texts.
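As a rough illustration of the workflow described above, the sketch below looks up encoders by model and by encoding name, then counts tokens. It assumes the package is installed through Composer and that the vocabulary can be downloaded on first use. The cache path is a placeholder chosen for this example, not a recommended location.

```php
<?php

declare(strict_types=1);

require __DIR__ . '/vendor/autoload.php';

use Yethee\Tiktoken\EncoderProvider;

$provider = new EncoderProvider();

// Optional: keep downloaded vocabularies in a custom directory.
// The path is a placeholder for this example and is assumed to be writable.
$provider->setVocabCache(sys_get_temp_dir() . '/tiktoken-cache');

// Look up an encoder by model name...
$byModel = $provider->getForModel('gpt-4');

// ...or by encoding name.
$byEncoding = $provider->get('p50k_base');

$text = 'Hello, world!';

// encode() returns a list of integer token IDs.
$tokens = $byModel->encode($text);

printf("%d tokens with the gpt-4 encoder\n", count($tokens));
printf("%d tokens with p50k_base\n", count($byEncoding->encode($text)));
```

The two counts may differ, since each encoding uses its own vocabulary.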
EncoderProvider can be registered as a singleton in AppServiceProvider or bound to interfaces for loose coupling.
TokenService consumed by:
Middleware (ValidatePromptLength).
Controllers (ChatController::validatePrompt()).
Jobs (GenerateEmbeddingsJob).
[...] o1/o3) via getForModel(), reducing duplication in multi-model apps. Critical for scaling AI features without per-model tokenization logic.
[...] TIKTOKEN_CACHE_DIR or EncoderProvider::setVocabCache(). Cache invalidation is handled automatically (checksum-based).
[...] <|endofprompt|>. Workaround: pre/post-process text or use a hybrid approach (e.g., regex + tiktoken).
[...] encodeInChunks() is unimplemented (TODO). Risk: may require custom logic for large texts (e.g., splitting prompts).
[...] Swoole extension.
[...] LibEncoder adds dependencies (Rust, FFI).
Register EncoderProvider as a singleton:
$app->singleton(EncoderProvider::class, fn() => new EncoderProvider());
public function handle(Request $request, Closure $next) {
    $encoder = app(EncoderProvider::class)->getForModel('gpt-4');
    $tokens = $encoder->encode($request->prompt);
    if (count($tokens) > config('ai.max_tokens')) {
        abort(422, 'Prompt exceeds token limit.');
    }
    return $next($request);
}
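The pre-processing workaround mentioned earlier for special tokens such as <|endofprompt|> could look something like the sketch below. It is plain PHP with no dependency on the package. The function name `stripSpecialTokens` and the marker pattern are assumptions made for illustration. The cleaned text is what would then be passed to `encode()`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: removes markers shaped like <|name|> from the text
 * and returns both the cleaned text and the markers that were found.
 *
 * @return array{text: string, markers: list<string>}
 */
function stripSpecialTokens(string $text): array
{
    // Assumed marker shape: letters and underscores between "<|" and "|>".
    $pattern = '/<\|[a-z_]+\|>/i';

    preg_match_all($pattern, $text, $matches);
    $clean = preg_replace($pattern, '', $text);

    return ['text' => $clean ?? $text, 'markers' => $matches[0]];
}

$result = stripSpecialTokens('Summarize this.<|endofprompt|>');
// $result['text'] is 'Summarize this.'
// $result['markers'] is ['<|endofprompt|>']
```

Returning the removed markers separately lets the caller decide whether to count them toward the limit or re-attach them after tokenization.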
public function handle() {
    $encoder = app(EncoderProvider::class)->get('p50k_base');
    $tokens = $encoder->encode($this->document->text);
    // Store tokens for later use (e.g., embeddings)
}
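Because encodeInChunks() is described here as unimplemented, a job that handles long documents may need its own splitting logic. The sketch below is one possible approach, not the package's method: it greedily packs paragraphs into chunks under a token budget. The helper name `splitByTokenBudget` is hypothetical, and the token counter is passed in as a callable so the example runs without the package. A word counter stands in for the real encoder.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: greedily packs paragraphs into chunks whose token
 * count stays within $maxTokens, as measured by $countTokens.
 * A single paragraph larger than the budget becomes its own chunk.
 *
 * @param callable(string): int $countTokens
 * @return list<string>
 */
function splitByTokenBudget(string $text, int $maxTokens, callable $countTokens): array
{
    $chunks = [];
    $current = '';

    // Paragraphs are assumed to be separated by one or more blank lines.
    foreach (preg_split('/\n{2,}/', trim($text)) as $paragraph) {
        $candidate = $current === '' ? $paragraph : $current . "\n\n" . $paragraph;

        if ($current !== '' && $countTokens($candidate) > $maxTokens) {
            // Adding this paragraph would exceed the budget: close the chunk.
            $chunks[] = $current;
            $current = $paragraph;
        } else {
            $current = $candidate;
        }
    }

    if ($current !== '') {
        $chunks[] = $current;
    }

    return $chunks;
}

// Stand-in counter (whitespace-separated words) so the sketch runs on its own.
// In an application this could be: fn (string $s): int => count($encoder->encode($s))
$countWords = fn (string $s): int => count(preg_split('/\s+/', trim($s), -1, PREG_SPLIT_NO_EMPTY));

$chunks = splitByTokenBudget("one two three\n\nfour five\n\nsix", 4, $countWords);
// $chunks is ["one two three", "four five\n\nsix"]
```

Re-counting the candidate on every step is simple but repeats work. For very large documents, counting each paragraph once and summing would be cheaper, at the cost of slightly inexact totals where tokens span the joins.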
[...] Prompt::update(['token_count' => count($tokens)])) for analytics or billing.
[...] 0.12.0 (1.5 years). Backward-compatible with Laravel’s long-term support (LTS) cycles.
[...] composer require + 10 lines of code for basic usage.
[...] 1.1.1, but edge cases may exist in high-concurrency environments (e.g., shared hosting). Mitigation: use Redis cache instead of filesystem.
[...] LibEncoder or a microservice (e.g., Python tiktoken) if needed.
[...] LibEncoder is worth the risk.
[...] LibEncoder or consider alternatives.
[...] encodeInChunks() may require custom logic.
[...] LibEncoder is adopted?
[...] EncoderProvider as a singleton or interface.
[...] php-ai/php-ai) for unified tokenization/API calls.
[...] composer bench).
composer require yethee/tiktoken
[...] prompts).
[...] EncoderProvider to service container.
[...] LibEncoder if performance is critical (requires Rust setup).
[...] encodeInChunks() is needed.
[...] LibEncoder if needed.
[...] o1/o3). Check releases for updates.
LibEncoder requires: [...]