theseer/tokenizer
A PHP tokenizer library for parsing PHP source code into tokens, built on top of ext/tokenizer. Used by tools like PHPUnit to analyze code, inspect syntax, and support static analysis, refactoring, and code quality workflows.
Install via Composer: composer require theseer/tokenizer. Start by instantiating TheSeer\Tokenizer\Tokenizer and passing PHP source code to tokenize:
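// Assumes Composer's autoloader at the default vendor/ path.
require __DIR__ . '/vendor/autoload.php';

use TheSeer\Tokenizer\Tokenizer;

$tokenizer = new Tokenizer();
// parse() returns a collection of Token objects.
$tokens = $tokenizer->parse('<?php echo "Hello";');
foreach ($tokens as $token) {
    printf("[%s] %s\n", $token->getName(), $token->getValue());
}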
A common first use case is static inspection, such as counting function declarations or finding @deprecated docblocks. The tokenizer returns an iterable collection of Token objects, each exposing getName(), getValue(), and getLine().
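The counting idea can be sketched with PHP's built-in token_get_all() from ext/tokenizer, the extension this library builds on. This is a minimal illustration, not the library's own API: the function name inspectSource() and the sample source are hypothetical, and by-reference closures are not handled. The same loop should carry over to the library's Token objects by comparing getName() against names such as "T_FUNCTION".

```php
<?php
declare(strict_types=1);

/**
 * Count named function declarations and docblocks that mention @deprecated.
 * Hypothetical helper built on ext/tokenizer, not on theseer/tokenizer.
 *
 * @return array{functions: int, deprecated: int}
 */
function inspectSource(string $source): array
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $result = ['functions' => 0, 'deprecated' => 0];

    for ($i = 0; $i < $count; $i++) {
        $token = $tokens[$i];
        if (!is_array($token)) {
            continue; // single-character tokens such as ";" are plain strings
        }

        if ($token[0] === T_DOC_COMMENT && strpos($token[1], '@deprecated') !== false) {
            $result['deprecated']++;
        }

        if ($token[0] === T_FUNCTION) {
            // Look past whitespace: "(" right after the keyword means a closure.
            $j = $i + 1;
            while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
                $j++;
            }
            if ($j < $count && $tokens[$j] !== '(') {
                $result['functions']++;
            }
        }
    }

    return $result;
}

// Sample input (illustrative only).
$source = <<<'PHP'
<?php
/** @deprecated use newApi() instead */
function oldApi() {}

function newApi() {}

$closure = function () {};
PHP;

$summary = inspectSource($source);
printf("functions: %d, deprecated: %d\n", $summary['functions'], $summary['deprecated']);
// prints "functions: 2, deprecated: 1"
```

The closure is skipped because its keyword is followed directly by "(", so only the two named declarations are counted.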
- Use TokenStream to group tokens (e.g., class/method blocks) with TokenStream::filter(), TokenStream::nextUntil(), or custom iterators.
- Use TokenStream::serialize() or manually emit tokens back to code.
- Use TokenFilter to skip whitespace/comments via Tokenizer::removeWhitespace() for cleaner analysis.
- … T_INLINE_HTML, T_FN).
- Use Token::getLine() for debugging, but avoid fine-grained edits.
- Call $tokenizer->removeWhitespace() if you need pure semantic tokens, but be aware that this may change token positions if you later reconstruct code.
- Start input with <?php: omitting it may cause unexpected T_INLINE_HTML tokens. Always normalize input.
- getName() returns PHP userland token names (e.g., "T_STRING"), not human-friendly labels, so you may need to map these.
- Extend Token or Tokenizer to add custom token types if building domain-specific tooling (e.g., custom annotations), but maintain compatibility by not overriding TokenStream logic lightly.
- Use Token::export() or dump the token stream to a file for visual inspection during development.
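The opening-tag pitfall and the effect of skipping whitespace and comments can be reproduced with the built-in token_get_all() and token_name() functions alone. The sketch below does not use the library's classes. The helpers tokenNames() and normalizeSource() are hypothetical names introduced for illustration, and the normalization ignores short echo tags such as <?=.

```php
<?php
declare(strict_types=1);

/**
 * Return token names for a source string, optionally dropping
 * whitespace and comments. Hypothetical helper built on ext/tokenizer.
 *
 * @return string[]
 */
function tokenNames(string $source, bool $skipTrivia = false): array
{
    $trivia = [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT];
    $names = [];

    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            $names[] = $token; // single-character token, e.g. ";"
            continue;
        }
        if ($skipTrivia && in_array($token[0], $trivia, true)) {
            continue;
        }
        $names[] = token_name($token[0]);
    }

    return $names;
}

/** Prepend an opening tag when the snippet lacks one (hypothetical helper). */
function normalizeSource(string $source): string
{
    return strncmp(ltrim($source), '<?php', 5) === 0 ? $source : "<?php\n" . $source;
}

$snippet = 'echo 1; // done';

// Without an opening tag the whole snippet is one T_INLINE_HTML token.
print_r(tokenNames($snippet));

// After normalization, with whitespace and comments skipped:
// T_OPEN_TAG, T_ECHO, T_LNUMBER, ";"
print_r(tokenNames(normalizeSource($snippet), true));
```

Because the filtered list no longer contains the whitespace and comment tokens, concatenating the remaining token values would not reproduce the original source, which is the position caveat noted above.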