Tokenizer Laravel Package

theseer/tokenizer

A small PHP library that tokenizes PHP source code using ext/tokenizer and can serialize the resulting tokens to XML. It is pulled in as a dependency of PHPUnit's code coverage tooling, and its token output can support code inspection, static analysis, and code quality workflows.


Getting Started

Install via Composer: composer require theseer/tokenizer. Start by instantiating TheSeer\Tokenizer\Tokenizer and passing PHP source code to its parse() method:

// Assumes the package was installed with Composer
require __DIR__ . '/vendor/autoload.php';

use TheSeer\Tokenizer\Tokenizer;

$tokenizer = new Tokenizer();
// parse() returns a TokenCollection that can be iterated directly
$tokens = $tokenizer->parse('<?php echo "Hello";');
foreach ($tokens as $token) {
    printf("[%s] %s\n", $token->getName(), $token->getValue());
}

The first use case is often static inspection, e.g., counting function declarations or finding @deprecated docblocks. The tokenizer returns a TokenCollection, an iterable of Token objects exposing getName(), getValue(), and getLine().
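The inspection idea above can be sketched without installing anything, because the library is built on ext/tokenizer. The following is a minimal sketch that calls PHP's built-in token_get_all() directly instead of the library's classes; countDeclarations() is a hypothetical helper name, not part of theseer/tokenizer.

```php
<?php
// Sketch only: uses ext/tokenizer directly, not the theseer/tokenizer classes.
// countDeclarations() is a hypothetical helper introduced for illustration.

function countDeclarations(string $source): array
{
    $functions = 0;
    $deprecated = 0;

    foreach (token_get_all($source) as $token) {
        // Single-character tokens such as ";" are returned as plain strings.
        if (!is_array($token)) {
            continue;
        }
        [$id, $text] = $token;

        if ($id === T_FUNCTION) {
            // Matches named functions, methods, and "function () {}" closures.
            $functions++;
        } elseif ($id === T_DOC_COMMENT && strpos($text, '@deprecated') !== false) {
            $deprecated++;
        }
    }

    return ['functions' => $functions, 'deprecated' => $deprecated];
}

$source = <<<'CODE'
<?php
/** @deprecated use newApi() instead */
function oldApi() {}

function newApi() {}
CODE;

$counts = countDeclarations($source);
printf("functions: %d, deprecated docblocks: %d\n", $counts['functions'], $counts['deprecated']);
// prints "functions: 2, deprecated docblocks: 1"
```

Note that T_FUNCTION does not cover arrow functions, which use T_FN, so a real counter would need to decide whether those count as declarations.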

Implementation Patterns

  • Stream-based processing: Iterate the TokenCollection returned by parse() and group tokens (e.g., class/method blocks) with your own loop or a custom iterator, for example by tracking keywords and brace depth.
  • Serializing token sequences: Use XMLSerializer::toXML() or XMLSerializer::toDom() to export the tokens as XML. To reconstruct PHP code for refactoring, emit the token values back in order yourself.
  • Context-aware filtering: Skip whitespace and comment tokens (T_WHITESPACE, T_COMMENT, T_DOC_COMMENT) by checking Token::getName() while iterating, which gives a cleaner stream for analysis.
  • Integration in tools: Often used in custom sniffers (e.g., alongside PHPCS), code quality metrics, or AST-less analyzers where speed matters more than full parse fidelity.
  • Version-safe parsing: Token names come from ext/tokenizer, so they depend on the PHP version running your tool (e.g., T_FN exists only from PHP 7.4 and T_MATCH only from PHP 8.0). Guard version-specific checks accordingly.
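The filtering and replay patterns above can be illustrated with a short sketch. It uses ext/tokenizer's token_get_all() directly rather than the library's classes, and normalizeTokens(), isSignificant(), and the 'CHAR' placeholder name are all hypothetical choices made for this example.

```php
<?php
// Sketch only: normalizeTokens() and isSignificant() are hypothetical helpers
// built on ext/tokenizer, not part of theseer/tokenizer.

/** Convert token_get_all() output into uniform name/value/line records. */
function normalizeTokens(string $source): array
{
    $line = 1;
    $tokens = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            [$id, $value, $startLine] = $token;
            $tokens[] = ['name' => token_name($id), 'value' => $value, 'line' => $startLine];
            // Remember where this token ends so plain-string tokens get a line too.
            $line = $startLine + substr_count($value, "\n");
        } else {
            // 'CHAR' is a made-up label for single-character tokens like ";".
            $tokens[] = ['name' => 'CHAR', 'value' => $token, 'line' => $line];
        }
    }
    return $tokens;
}

/** True for tokens that are neither whitespace nor comments. */
function isSignificant(array $token): bool
{
    return !in_array($token['name'], ['T_WHITESPACE', 'T_COMMENT', 'T_DOC_COMMENT'], true);
}

$source = "<?php\n// greet\necho 'hi' ;\n";
$all = normalizeTokens($source);
$significant = array_values(array_filter($all, 'isSignificant'));

// Replaying every token value in order reproduces the input exactly.
$replayed = implode('', array_column($all, 'value'));

echo implode(' ', array_column($significant, 'name')), "\n";
var_dump($replayed === $source);
```

Keeping the unfiltered list alongside the filtered one is a deliberate choice here: analysis runs on the significant tokens, while any code reconstruction uses the full list so that whitespace and comments survive.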

Gotchas and Tips

  • No position tracking: Unlike full AST tooling, the library provides no column offsets, only line numbers. Use Token::getLine() for debugging, but avoid relying on it for fine-grained edits.
  • Whitespace handling: Whitespace tokens are included in the output. Filter them out yourself if you need only semantic tokens, but keep the unfiltered collection if you later reconstruct code, since dropping whitespace changes the emitted source.
  • Open tag expectations: The tokenizer expects an opening <?php tag. Without it, the input is treated as inline HTML and comes back as T_INLINE_HTML. Always normalize input.
  • Token naming quirk: getName() returns PHP token names (e.g., "T_STRING"), not human-friendly labels, so you may need to map them for display.
  • Extensibility: Extend Token or Tokenizer to add custom token types if building domain-specific tooling (e.g., custom annotations), but maintain compatibility by not overriding TokenStream logic lightly.
  • Debugging tip: Dump the token collection (for example, the XMLSerializer::toXML() output) to a file for visual inspection during development.
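The open-tag gotcha is easy to reproduce. This minimal sketch again uses ext/tokenizer's token_get_all() directly, and tokenNames() is a hypothetical helper that also doubles as a quick way to dump a token sequence while debugging.

```php
<?php
// Sketch only: tokenNames() is a hypothetical helper built on ext/tokenizer.

/** Return token names, keeping single-character tokens as their literal text. */
function tokenNames(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

$withTag = tokenNames('<?php echo 1;');
$withoutTag = tokenNames('echo 1;');

echo implode(' ', $withTag), "\n";    // prints "T_OPEN_TAG T_ECHO T_WHITESPACE T_LNUMBER ;"
echo implode(' ', $withoutTag), "\n"; // prints "T_INLINE_HTML"
```

Without the opening tag, the whole input collapses into a single T_INLINE_HTML token, so any analysis that looks for T_ECHO or similar keywords silently finds nothing.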