Code Weaver
Helps Laravel developers discover, compare, and choose open-source packages. See popularity, security, maintainers, and scores at a glance to make better decisions.
Tiktoken Laravel Package

yethee/tiktoken

PHP port of OpenAI tiktoken for fast tokenization. Get encoders by model or encoding name, encode text to token IDs, with default vocab caching and configurable cache dir. Optional experimental FFI lib mode (tiktoken-rs) for better performance on larger texts.


yethee/tiktoken is a PHP port of OpenAI’s tiktoken tokenizer, letting you encode text into token IDs using the same model-specific vocabularies as popular GPT models. It’s designed for practical use in PHP/Laravel apps where accurate token counting and chunking matters.

It includes built-in vocabulary caching for speed and can optionally use an FFI-backed Rust library for higher throughput on medium/large inputs.

  • Model-aware encoders via EncoderProvider::getForModel() (e.g., GPT-3.5 variants)
  • Direct access to named encodings (e.g., p50k_base)
  • Automatic vocab caching (override via TIKTOKEN_CACHE_DIR or setVocabCache())
  • Experimental Lib mode using tiktoken-rs through FFI (TIKTOKEN_LIB_PATH)
  • Simple API: $encoder->encode('Hello world!')
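The features above map to a small API surface. A minimal sketch, assuming the package is installed via `composer require yethee/tiktoken`:

```php
<?php

require __DIR__ . '/vendor/autoload.php';

use Yethee\Tiktoken\EncoderProvider;

$provider = new EncoderProvider();

// Resolve an encoder from a model name...
$encoder = $provider->getForModel('gpt-3.5-turbo');
$tokens  = $encoder->encode('Hello world!');

// ...or from a named encoding directly.
$p50k = $provider->get('p50k_base');

echo count($tokens) . " tokens\n";
```

The same `Encoder` instance can be reused across calls; the vocabulary is loaded (and cached) on first use.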
Frequently asked questions about Tiktoken
How do I install tiktoken-php in a Laravel project?
Run `composer require yethee/tiktoken` in your project root. The package has no Laravel-specific dependencies and integrates seamlessly with the service container. Register the `EncoderProvider` as a singleton in `AppServiceProvider` for global access.
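A sketch of the singleton registration described above (the `storage_path` cache directory is an illustrative choice, not a package requirement):

```php
<?php

namespace App\Providers;

use Illuminate\Support\ServiceProvider;
use Yethee\Tiktoken\EncoderProvider;

class AppServiceProvider extends ServiceProvider
{
    public function register(): void
    {
        // One shared provider keeps the vocabulary cache warm across resolutions.
        $this->app->singleton(EncoderProvider::class, function () {
            $provider = new EncoderProvider();
            $provider->setVocabCache(storage_path('app/tiktoken'));

            return $provider;
        });
    }
}
```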
Which Laravel versions does this package support?
The package is framework-agnostic but works with Laravel 8.x and later. It leverages PHP 8.0+ features and has no version-specific dependencies. Tested with Laravel’s long-term support (LTS) cycles, so it aligns with most modern Laravel apps.
Can I use this for tokenizing prompts in Laravel middleware?
Yes. Inject the `EncoderProvider` via the service container and validate token counts before processing requests. Example: `app(EncoderProvider::class)->getForModel('gpt-4')->encode($request->prompt)` to check against `config('ai.max_tokens')` in middleware.
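A middleware sketch along those lines; the `ai.max_tokens` config key is hypothetical, as in the example above:

```php
<?php

namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;
use Yethee\Tiktoken\EncoderProvider;

class EnforcePromptTokenLimit
{
    public function __construct(private EncoderProvider $encoders) {}

    public function handle(Request $request, Closure $next)
    {
        $tokens = $this->encoders
            ->getForModel('gpt-4')
            ->encode((string) $request->input('prompt', ''));

        // 'ai.max_tokens' is a hypothetical config key for this sketch.
        if (count($tokens) > config('ai.max_tokens', 4096)) {
            abort(422, 'Prompt exceeds the token limit.');
        }

        return $next($request);
    }
}
```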
Does tiktoken-php support GPT-2 or custom vocabularies?
No, GPT-2 is unsupported. For custom vocabularies, you’ll need to pre-process text or use a hybrid approach (e.g., regex + tiktoken). The package focuses on OpenAI’s standard models (GPT-3.5/4/5, embeddings, etc.).
How do I cache vocabularies in Laravel’s Redis instead of the filesystem?
You can't point `TIKTOKEN_CACHE_DIR` at Redis directly: it expects a filesystem directory, not a DSN. The same goes for `EncoderProvider::setVocabCache()`, which takes a cache directory path. If the cache must live off the local disk, mount shared or Redis-backed storage at that path; in practice the filesystem cache is cheap, since each vocabulary is loaded once per process.
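A sketch of setting the cache location in code, assuming `setVocabCache()` accepts a directory path (equivalent to setting the `TIKTOKEN_CACHE_DIR` environment variable):

```php
<?php

require __DIR__ . '/vendor/autoload.php';

use Yethee\Tiktoken\EncoderProvider;

$provider = new EncoderProvider();

// Downloaded/parsed vocabularies are cached under this directory.
$provider->setVocabCache('/var/cache/tiktoken');
```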
What’s the performance impact of using the experimental LibEncoder?
LibEncoder (Rust/FFI) offers 2–5x speed for large texts but adds complexity. Benchmark first—it’s only viable if tokenization is a bottleneck (e.g., batch jobs). For small texts, the native PHP encoder (~5–10ms per 1k tokens) is sufficient for most Laravel use cases.
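Per the feature list, the experimental Lib mode is enabled by pointing `TIKTOKEN_LIB_PATH` at a compiled tiktoken-rs shared library. A sketch (the library filename and path are illustrative):

```shell
# Enable the experimental FFI-backed encoder; path is illustrative.
export TIKTOKEN_LIB_PATH=/usr/local/lib/libtiktoken_rs.so
```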
Can I use this package in Laravel queues/jobs for batch processing?
Yes. Pre-tokenize documents in background jobs (e.g., `GenerateEmbeddingsJob`) and store tokenized data for later use. The synchronous API is fine for queues, but offload heavy processing to avoid blocking I/O-bound tasks.
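A job sketch for that pattern; the `Document` model and its `token_count` column are hypothetical names for this example:

```php
<?php

namespace App\Jobs;

use App\Models\Document;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Yethee\Tiktoken\EncoderProvider;

class TokenizeDocumentJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(public int $documentId) {}

    public function handle(EncoderProvider $encoders): void
    {
        $document = Document::findOrFail($this->documentId);

        $tokens = $encoders
            ->getForModel('gpt-3.5-turbo')
            ->encode($document->body);

        // Persist the count (and optionally the token IDs) for later batching.
        $document->update(['token_count' => count($tokens)]);
    }
}
```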
Are there alternatives to tiktoken-php for Laravel?
For PHP, this is the most complete port of OpenAI's tokenizer. Alternatives include shelling out to Python's `tiktoken` or to a Node.js tokenizer, but both add a runtime dependency outside PHP. Rust's `tiktoken-rs` is also available standalone, and is the same library this package can load through its experimental FFI mode.
How do I handle tokenization errors in production?
Wrap encoder calls in try-catch blocks for `InvalidArgumentException` (e.g., an unsupported model or encoding name). Log the error and implement fallback logic, such as retrying with a known encoding or failing the request with a clear error response.
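A sketch of that fallback pattern; falling back to `cl100k_base` is an illustrative choice, not a package recommendation:

```php
<?php

require __DIR__ . '/vendor/autoload.php';

use Yethee\Tiktoken\EncoderProvider;

$provider = new EncoderProvider();

try {
    $tokens = $provider->getForModel($model)->encode($text);
} catch (\InvalidArgumentException $e) {
    // Unknown model/encoding name: report it, then fall back to a known encoding.
    report($e);
    $tokens = $provider->get('cl100k_base')->encode($text);
}
```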
Does this package work with Laravel’s caching backends (Redis, Memcached)?
Only indirectly. The vocabulary cache is file-based (defaulting to the system temp directory), so Laravel's Redis or Memcached cache stores aren't used. If you need the cache shared across hosts, point `TIKTOKEN_CACHE_DIR` at shared storage; otherwise the local filesystem default is sufficient.