- How do I install tiktoken-php in a Laravel project?
- Run `composer require yethee/tiktoken` in your project root. The package has no Laravel-specific dependencies and integrates seamlessly with the service container. Register the `EncoderProvider` as a singleton in `AppServiceProvider` for global access.
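A minimal sketch of the singleton registration described above, assuming the `EncoderProvider` class from `yethee/tiktoken` (the `storage_path('app/tiktoken')` cache location is an illustrative choice, not a package default):

```php
<?php

namespace App\Providers;

use Illuminate\Support\ServiceProvider;
use Yethee\Tiktoken\EncoderProvider;

class AppServiceProvider extends ServiceProvider
{
    public function register(): void
    {
        // Share one EncoderProvider instance so vocabularies are
        // loaded and cached only once per request lifecycle.
        $this->app->singleton(EncoderProvider::class, function () {
            $provider = new EncoderProvider();
            $provider->setVocabCache(storage_path('app/tiktoken'));

            return $provider;
        });
    }
}
```

Anywhere in the app you can then resolve it with `app(EncoderProvider::class)` or via constructor injection.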
- Which Laravel versions does this package support?
- The package is framework-agnostic, so it works with Laravel 8.x and later. It targets modern PHP 8 (check the package's `composer.json` for the exact minimum version) and pins no framework-specific dependencies, so it aligns with current Laravel releases, including LTS cycles.
- Can I use this for tokenizing prompts in Laravel middleware?
- Yes. Inject the `EncoderProvider` via the service container and validate token counts before processing requests. Example: `app(EncoderProvider::class)->getForModel('gpt-4')->encode($request->prompt)` to check against `config('ai.max_tokens')` in middleware.
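A sketch of that middleware, expanded into a full class. The class name `EnforcePromptTokenLimit` and the `ai.max_tokens` config key are illustrative (define the latter in a `config/ai.php` of your own):

```php
<?php

namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;
use Yethee\Tiktoken\EncoderProvider;

class EnforcePromptTokenLimit
{
    public function __construct(private EncoderProvider $encoders)
    {
    }

    public function handle(Request $request, Closure $next)
    {
        // Count tokens the same way the model will, before spending an API call.
        $tokens = $this->encoders
            ->getForModel('gpt-4')
            ->encode((string) $request->input('prompt', ''));

        if (count($tokens) > config('ai.max_tokens', 4096)) {
            abort(422, 'Prompt exceeds the configured token limit.');
        }

        return $next($request);
    }
}
```

Register it per-route (e.g., on your AI endpoints) rather than globally, since tokenization adds measurable latency on large payloads.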
- Does tiktoken-php support GPT-2 or custom vocabularies?
- No, GPT-2 is unsupported. For custom vocabularies, you’ll need to pre-process text or use a hybrid approach (e.g., regex + tiktoken). The package focuses on OpenAI’s standard models (GPT-3.5/4/5, embeddings, etc.).
- How do I cache vocabularies in Laravel’s Redis instead of the filesystem?
- `TIKTOKEN_CACHE_DIR` expects a local filesystem path, so a `redis://` connection string will not work, and the package ships no Redis cache backend. Point the variable (or a call to `EncoderProvider::setVocabCache()`) at a writable directory such as `storage_path('app/tiktoken')`. If you need the cache shared across hosts, back that path with shared storage or warm it during deployment.
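One way to warm the vocabulary cache at deploy time is an Artisan command; this is a sketch (the `tiktoken:warm` command name and model list are assumptions), relying on the fact that requesting an encoder triggers the vocabulary download:

```php
<?php

namespace App\Console\Commands;

use Illuminate\Console\Command;
use Yethee\Tiktoken\EncoderProvider;

class WarmTiktokenCache extends Command
{
    protected $signature = 'tiktoken:warm';
    protected $description = 'Download and cache tiktoken vocabularies ahead of time';

    public function handle(EncoderProvider $encoders): int
    {
        // Requesting an encoder fetches its vocabulary, populating
        // the directory configured via setVocabCache().
        foreach (['gpt-3.5-turbo', 'gpt-4'] as $model) {
            $encoders->getForModel($model);
            $this->info("Cached vocabulary for {$model}");
        }

        return self::SUCCESS;
    }
}
```

Run `php artisan tiktoken:warm` in your deploy script so the first production request never pays the download cost.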
- What’s the performance impact of using the experimental LibEncoder?
- LibEncoder (Rust/FFI) offers 2–5x speed for large texts but adds complexity. Benchmark first—it’s only viable if tokenization is a bottleneck (e.g., batch jobs). For small texts, the native PHP encoder (~5–10ms per 1k tokens) is sufficient for most Laravel use cases.
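Before switching encoders, it is worth measuring on your own payloads. A rough benchmark sketch for the native PHP encoder (sample text and model choice are arbitrary):

```php
<?php

require __DIR__ . '/vendor/autoload.php';

use Yethee\Tiktoken\EncoderProvider;

$encoder = (new EncoderProvider())->getForModel('gpt-3.5-turbo');
$text = str_repeat('Benchmark this sentence for token throughput. ', 200);

// hrtime(true) returns nanoseconds; convert to milliseconds for readability.
$start = hrtime(true);
$tokens = $encoder->encode($text);
$elapsedMs = (hrtime(true) - $start) / 1e6;

printf("%d tokens in %.2f ms\n", count($tokens), $elapsedMs);
```

If the native numbers are already well under your latency budget, the FFI backend's extra operational complexity is hard to justify.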
- Can I use this package in Laravel queues/jobs for batch processing?
- Yes. Pre-tokenize documents in background jobs (e.g., `GenerateEmbeddingsJob`) and store the tokenized data for later use. The synchronous API is fine inside queue workers; the point of queueing is to move heavy tokenization out of the request cycle so it never blocks user-facing responses.
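A sketch of such a job, in the spirit of the `GenerateEmbeddingsJob` mentioned above. The `Document` model and its `token_count` column are hypothetical stand-ins for your own schema:

```php
<?php

namespace App\Jobs;

use App\Models\Document;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Yethee\Tiktoken\EncoderProvider;

class TokenizeDocumentJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(public int $documentId)
    {
    }

    public function handle(EncoderProvider $encoders): void
    {
        $document = Document::findOrFail($this->documentId);

        $tokens = $encoders
            ->getForModel('gpt-3.5-turbo')
            ->encode($document->body);

        // Persist the count (and optionally the tokens themselves)
        // so later embedding/completion calls can budget requests.
        $document->update(['token_count' => count($tokens)]);
    }
}
```

Dispatch with `TokenizeDocumentJob::dispatch($document->id)`; passing the ID rather than the model keeps the serialized payload small.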
- Are there alternatives to tiktoken-php for Laravel?
- For PHP, this is the most complete OpenAI-compatible port. Alternatives include calling Python's original `tiktoken` from a subprocess or using a JavaScript port under Node.js, but both add an extra runtime dependency to your deployment. For Rust users, `tiktoken-rs` exists but has no Laravel integration.
- How do I handle tokenization errors in production?
- Wrap encoder calls in try-catch blocks for `InvalidArgumentException` (e.g., unsupported models). Log errors and implement fallback logic (e.g., retry with a different encoder or abort with a 500 response). Cache invalidation is automatic via checksums.
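A sketch of the try-catch-with-fallback pattern described above. Falling back to the `cl100k_base` encoding is an assumption of this example, not a package default; pick whatever fallback matches your models:

```php
<?php

use Illuminate\Support\Facades\Log;
use Yethee\Tiktoken\EncoderProvider;

function countTokens(EncoderProvider $encoders, string $model, string $text): int
{
    try {
        return count($encoders->getForModel($model)->encode($text));
    } catch (\InvalidArgumentException $e) {
        // Unknown/unsupported model: log and retry with a known
        // encoding rather than failing the whole request.
        Log::warning('Tokenizer does not recognize model, using fallback', [
            'model' => $model,
            'error' => $e->getMessage(),
        ]);

        return count($encoders->get('cl100k_base')->encode($text));
    }
}
```

Only return a 500 when even the fallback fails; an approximate count from a neighboring encoding is usually better than rejecting the request outright.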
- Does this package work with Laravel’s caching backends (Redis, Memcached)?
- Only indirectly. The vocabulary cache is file-based: `TIKTOKEN_CACHE_DIR` must point at a writable directory, so it cannot target Redis or Memcached directly. To share the cache across hosts, mount shared storage at that path or warm the directory during deployment. By default the package falls back to PHP's temp directory.
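For example, pointing the cache at a persistent Laravel storage path (the path itself is illustrative):

```ini
# .env
TIKTOKEN_CACHE_DIR=/var/www/storage/app/tiktoken
```

Using a path under `storage/` keeps the downloaded vocabularies out of your release directory, so they survive zero-downtime deploys that swap the application root.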