parsica-php/parsica
Parsica is a PHP parser combinator library for building custom parsers from small reusable pieces. Compose complex grammars with a fluent API, parse strings into structured results, and handle errors cleanly—ideal for DSLs, config formats, and language tooling.
Internally, Parsica is designed using paradigms from functional programming. We list them here for anybody who's interested in FP, but you don't need to know them to work with Parsica.
Throughout this document, $parser1 ≡ $parser2 means that you can swap $parser1 with $parser2 and vice-versa, and it will not affect the outcome of your program.
Almost all the code is pure and referentially transparent. A notable exception is the combo of recursive() and Parser::recurse(). The latter mutates a Parser. We constrained this so that you can't use the parser when it's not set up yet, and after calling recurse(), you can't call it again. So not strictly pure, but close enough not to matter much in practice.
The combinators are all pure. Some combinators are implemented as instance methods on Parser, but these are also pure. You can think of them as functions that take $this as the first argument.
$parser1->combinator($parser2)
≡ combinator($parser1, $parser2)
In fact, very often there are both a function and an instance method for the same combinator, where one is an alias for the other.
There are no generics in PHP 7.4, but we use the Psalm static typechecker to simulate some of it. The two type are really Parser<T> and ParseResult<T>, where T is the type of the resulting output in the case of a successful parse.
ParseResult<T> is approximately an Either<ParseFailure, ParseSuccess<T>> type.
ParseResult and Parser are functors, using the map method.
For ParseResult, the function is only applied to the output if ParseResult::isSuccess() is true, and ignored in other cases.
Similarly, mapping over Parser is really mapping over the future ParseResult.
ParseResult<T> is a monoid under the ParseResult::append() operation, when T is a monoid as well. discard() is the zero value.
Parser<T> is a monoid under the Parser::append(), when T is a monoid as well. nothing() is the zero value.
$parser->append(nothing()) ≡ $parser
nothing()->append($parser) ≡ $parser
$p1->append($p2)->append($p3)
≡ $p1->append($p2->append($p3))
Parser<T> is an applicative functor.
pure() is a parser that will always output its argument, no matter what the input was. Type: T -> Parser<T>.apply() is sequential application, aka <*>. pure($callable)->apply($parser) is a parser that applies $callable to the output of $parser. It works for callables with multiple arguments, if the callable is curried: pure(curry($callable))->apply($p1)->apply($p2). We used matteosister/php-curry to test this, but any method for currying functions should work.keepFirst() and keepSecond() are <* and *> respectively. Both parsers need to succeed but only the result from one of them is returned.pure(identity())->apply($parser) ≡ $parser
pure($f)->apply(pure($x)) ≡ pure($f($x))
$p->apply(pure($x))
≡ pure(fn($f) => $f($x))->apply($p)
// Assuming that
$compose = fn($f, $g) => fn($x) => $f($g($x))
pure($compose)->apply($p1)->apply($p2)->apply($p3)
≡ $p1->apply($p2->apply($p3))
pure($f)->apply($parser) ≡ $parser->map($f)
Parser<T> is a monad.
pure(): see above.sequence() runs two parsers in sequence, dropping the result of the first one. Both parsers consume input. You may know this as >>. The type of sequence is Parser<T> -> Parser<T2> -> Parser<T2>.bind() sequentially composes a parser and a parser-constructing function, passing the output produced by the first parser as an argument to the second. Both parsers consume input. You may know this as >>= or flatmap. Type: Parser<T> -> (T -> Parser<T2>) -> Parser<T2>.Left identity:
bind(pure($a), $f)
≡ pure($a)->bind($f)
≡ $f($a)
Right identity:
bind($parser, 'pure')
≡ $parser->bind('pure')
≡ $parser
Associativity:
$parser->bind($f)->bind($g)
≡ $parser->bind(fn($x) (use $f, $g) => $f($x)->bind($g))
How can I help you explore Laravel packages today?