
Tiktoken Laravel Package

yethee/tiktoken

PHP port of OpenAI’s tiktoken tokenizer. Get encoders by model name, encode text to token IDs, and cache vocab files for speed. Optional experimental Rust/FFI “lib mode” for faster encoding of medium/large texts.

Frequently asked questions about Tiktoken
How do I install yethee/tiktoken in a Laravel project?
Run `composer require yethee/tiktoken` in your project root. No additional setup is needed for basic usage. The package requires PHP 8.1+ and installs like any other Composer dependency in a Laravel project.
Which GPT models are supported out of the box?
The package supports OpenAI models that use the standard tiktoken vocabularies, including GPT-3.5 (e.g., `gpt-3.5-turbo-0301`), GPT-4, GPT-5, and the embedding models. Check the [changelog](https://github.com/yethee/tiktoken-php/blob/master/CHANGELOG.md) for the latest supported models.
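As a minimal sketch of basic usage, assuming the package is installed and Composer's autoloader is active (Laravel loads it for you), an encoder can be fetched by model name or by encoding name, as the project's README shows. The helper `countTokens()` and the wrapper `demoTiktoken()` are hypothetical names introduced here for illustration.

```php
<?php

declare(strict_types=1);

use Yethee\Tiktoken\EncoderProvider;

/**
 * Hypothetical helper: counts the token IDs returned by any object that
 * exposes encode(string): int[], which the package's encoders do.
 */
function countTokens(object $encoder, string $text): int
{
    return count($encoder->encode($text));
}

/**
 * Hypothetical demo, not called here: it needs yethee/tiktoken installed.
 */
function demoTiktoken(): void
{
    $provider = new EncoderProvider();

    // Look up an encoder by model name...
    $byModel = $provider->getForModel('gpt-3.5-turbo-0301');
    // ...or directly by encoding (vocabulary) name.
    $byEncoding = $provider->get('p50k_base');

    echo countTokens($byModel, 'Hello world!'), PHP_EOL;
    echo countTokens($byEncoding, 'Hello world!'), PHP_EOL;
}
```

The first call for a given vocabulary may be slower than later ones, because the vocabulary file has to be fetched and cached.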
Can I use this package to validate token counts before calling OpenAI’s API?
Yes. Use `$encoder->encode()` to tokenize user input, count the returned token IDs, and compare that count against your model’s token limit (e.g., 8,192 for GPT-4). Integrate this logic in Laravel middleware or a service layer to reject oversized prompts early.
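The check described above can be sketched as a small budget calculation. This is an illustration under stated assumptions, not part of the package: `remainingTokens()` and `promptFits()` are hypothetical helpers, the encoder is passed in as a callable, and the reply reserve is an extra parameter added here because the completion usually shares the same context window.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: tokens left in the budget after encoding $prompt.
 * A negative result means the prompt is too large.
 *
 * $encode is any callable returning a list of token IDs, for example
 * fn (string $t): array => $encoder->encode($t) with a package encoder.
 */
function remainingTokens(
    callable $encode,
    string $prompt,
    int $contextLimit,
    int $reservedForReply = 0
): int {
    return $contextLimit - $reservedForReply - count($encode($prompt));
}

/**
 * Hypothetical helper: true when the prompt fits within the budget.
 */
function promptFits(
    callable $encode,
    string $prompt,
    int $contextLimit,
    int $reservedForReply = 0
): bool {
    return remainingTokens($encode, $prompt, $contextLimit, $reservedForReply) >= 0;
}
```

In middleware or a service class, a failed `promptFits()` check could translate into a validation error before any request is sent to OpenAI.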
How does the optional FFI/LibEncoder mode improve performance?
The experimental `LibEncoder` uses Rust’s `tiktoken-rs` via FFI for ~2x faster encoding on large texts (e.g., >10,000 tokens). However, it requires building native libraries and adds setup complexity. Benchmark it for your workload, because it may not help for small texts due to marshalling overhead.
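One way to run such a benchmark is a small timing harness that treats each encoder as a callable. This is only a sketch: `averageEncodeMs()` is a hypothetical helper, and how the lib-mode encoder is constructed is left to the package README rather than assumed here.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: average wall-clock time in milliseconds for one
 * call of $encode on $text, measured over $iterations runs.
 */
function averageEncodeMs(callable $encode, string $text, int $iterations = 20): float
{
    if ($iterations < 1) {
        throw new InvalidArgumentException('iterations must be at least 1');
    }

    $start = hrtime(true); // monotonic clock, nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $encode($text);
    }

    return (hrtime(true) - $start) / 1e6 / $iterations;
}
```

Calling it once with a closure around the pure-PHP encoder and once with a closure around the lib-mode encoder, on texts of the sizes your application actually sees, gives a like-for-like comparison.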
Where should I store the vocabulary cache for production?
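The vocabulary cache is file-based: `TIKTOKEN_CACHE_DIR` names a directory on disk, and by default the package uses the system temporary directory. For production, point it at a persistent, writable directory (e.g., under Laravel’s `storage/` path) via `EncoderProvider::setVocabCache()` or the `TIKTOKEN_CACHE_DIR` environment variable. This keeps vocabulary files from being fetched again whenever the temporary directory is cleared.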
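A minimal sketch of pinning the vocabulary cache to a fixed directory, assuming `setVocabCache()` accepts a directory path as the README describes. `ensureVocabCacheDir()` and `makeProvider()` are hypothetical helpers introduced for illustration.

```php
<?php

declare(strict_types=1);

use Yethee\Tiktoken\EncoderProvider;

/**
 * Hypothetical helper: creates the cache directory if it is missing and
 * returns its path.
 */
function ensureVocabCacheDir(string $dir): string
{
    if (!is_dir($dir) && !mkdir($dir, 0775, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create vocab cache directory: {$dir}");
    }

    return $dir;
}

/**
 * Hypothetical factory, not called here: it needs yethee/tiktoken installed.
 */
function makeProvider(string $cacheDir): EncoderProvider
{
    $provider = new EncoderProvider();
    $provider->setVocabCache(ensureVocabCacheDir($cacheDir));

    return $provider;
}
```

In a Laravel app the directory could come from `storage_path('app/tiktoken')`, and the factory could be registered as a singleton so every request reuses the same provider.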
How do I integrate tokenization into Laravel queues for batch processing?
Dispatch a job with the text payload, then use `$encoder->encode()` inside the job’s `handle()` method. For example: `TokenizeTextJob::dispatch($userInput)->onQueue('tokenization');`. This offloads heavy tokenization from API requests.
Are there any known issues with token counts matching OpenAI’s SDK?
The package replicates OpenAI’s tokenization logic, but edge cases (e.g., rare Unicode characters) may diverge. Validate counts against OpenAI’s Python SDK in unit tests. Report discrepancies on the [GitHub issue tracker](https://github.com/yethee/tiktoken-php/issues).
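One way to structure such a check is to record token IDs from the Python tokenizer as fixtures and compare them with what the PHP encoder returns. The sketch below assumes that approach; `findTokenMismatches()` is a hypothetical helper, and the fixture values must come from your own run of the reference tokenizer.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns the inputs whose encoded token IDs differ
 * from the expected IDs recorded from the reference tokenizer.
 *
 * @param callable             $encode   fn (string): int[]
 * @param array<string, int[]> $fixtures text => expected token IDs
 *
 * @return array<string, array{expected: int[], actual: int[]}>
 */
function findTokenMismatches(callable $encode, array $fixtures): array
{
    $mismatches = [];

    foreach ($fixtures as $text => $expected) {
        // Numeric-looking array keys are stored as ints, so cast back.
        $actual = $encode((string) $text);

        if ($actual !== $expected) {
            $mismatches[$text] = ['expected' => $expected, 'actual' => $actual];
        }
    }

    return $mismatches;
}
```

A unit test can then assert that the returned array is empty, and its contents make a ready-made reproduction when reporting a discrepancy.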
Can I use this package with GPT-2 or custom models requiring special tokens?
No. This package only supports models using standard tiktoken vocabularies (e.g., `p50k_base`). GPT-2 and models with custom tokens (e.g., `<|startoftext|>`) are not compatible. Check the [supported models list](https://github.com/openai/tiktoken#supported-models) for details.
How do I configure the package to use Redis for vocabulary caching?
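`TIKTOKEN_CACHE_DIR` is documented as a directory path, so a Redis URL such as `redis://127.0.0.1:6379/0` is unlikely to work there. A Redis-backed vocabulary cache would need custom code, for example storing the vocabulary data through Laravel’s `Cache` facade (`Cache::put('tiktoken/vocab', $vocabData, now()->addHours(1));`) and wiring that into the provider. Check the package source for the extension points available in your version before relying on this.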
What are the tradeoffs of using LibEncoder in production?
LibEncoder offers speed but requires Rust toolchain setup and native library builds. Avoid it unless profiling shows that tokenization is a bottleneck (for example, sustained workloads above 10k tokens/sec). Test thoroughly, because unstable builds or a misconfigured `LD_LIBRARY_PATH` can crash PHP. Use the native encoder for simplicity unless performance demands it.