# sanmai/pipeline
sanmai/pipeline is a lightweight PHP pipeline library to process data through a chain of stages. Compose reusable, testable transformations with clear input/output flow, and plug in custom middleware-like steps for flexible processing in any app.
This guide provides techniques for optimizing your pipelines for speed and memory efficiency.
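For orientation, here is a minimal end-to-end pipeline. This is a sketch; it assumes the package is installed via `composer require sanmai/pipeline`, which provides the `take()` helper.

```php
<?php

use function Pipeline\take;

// Keep the even numbers, then scale them; each stage is a small,
// reusable, testable transformation.
$result = take([1, 2, 3, 4, 5])
    ->filter(fn(int $n): bool => $n % 2 === 0)
    ->map(fn(int $n): int => $n * 10)
    ->toList();

// $result is [20, 40]
```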
The library's core strength is its ability to process large datasets with minimal memory usage through streaming. This is achieved by using iterators and generators, which process data one element at a time.
## Example: Finding Errors in a Large Log File
Consider the task of finding the first five "ERROR" lines in a 10 GB log file.
### The Inefficient Way (Loading into Memory)
```php
<?php

use function Pipeline\take;

// Warning: this will likely exhaust your server's memory.
$lines = file('huge-10GB.log'); // Loads the entire 10 GB file into memory
$errors = take($lines)
    ->filter(fn($line) => str_contains($line, 'ERROR'))
    ->slice(0, 5)
    ->toList();
```
### The Efficient Way (Streaming)
```php
<?php

use function Pipeline\take;

// This is memory-safe and fast: SplFileObject yields one line at a time.
$errors = take(new SplFileObject('huge-10GB.log'))
    ->filter(fn($line) => str_contains($line, 'ERROR'))
    ->slice(0, 5)
    ->toList();
```
The streaming approach is significantly faster and more memory-efficient because it reads the file line by line and stops as soon as the required number of errors has been found.
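The same lazy behavior applies to any iterator or generator, not just files. As a sketch, even an endless generator is safe to pipe through, because `slice()` stops pulling values once it has enough:

```php
<?php

use function Pipeline\take;

/** Yields 1, 2, 3, ... without end. */
function naturals(): \Generator
{
    for ($i = 1; /* forever */; $i++) {
        yield $i;
    }
}

// Only the first five values are ever generated.
$squares = take(naturals())
    ->map(fn(int $n): int => $n ** 2)
    ->slice(0, 5)
    ->toList();

// $squares is [1, 4, 9, 16, 25]
```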
## The stream() Method

The `stream()` method is a powerful tool for controlling how your pipeline processes data. It converts the pipeline to use generators, forcing element-by-element processing instead of batch operations.
By default, when working with arrays, the library optimizes certain operations:
```php
// Without stream(): creates intermediate arrays
$result = take($largeArray)
    ->filter($predicate) // Creates a new filtered array in memory
    ->map($transformer)  // Creates another transformed array
    ->toList();
```
With stream(), each element flows through the entire pipeline before the next one starts:
```php
// With stream(): processes one element at a time
$result = take($largeArray)
    ->stream()           // Convert to generator
    ->filter($predicate) // Element passes through filter...
    ->map($transformer)  // ...then immediately through map
    ->toList();
```
### When to Use stream()

Use `stream()` when:

- Processing large arrays, where `stream()` significantly reduces peak memory usage
- Memory is constrained

## Array-Optimized Methods

For convenience, some methods are optimized for speed when working with small arrays:

- `filter()`
- `cast()`
- `slice()`
- `chunk()`

These methods use native PHP array functions internally, which can be faster for small datasets. However, they create intermediate arrays in memory, so they should be used with caution.
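For instance, a small lookup table can go through the optimized array path directly. The sketch below assumes `filter()` with no callback removes falsy values and `cast()` applies a one-to-one transform:

```php
<?php

use function Pipeline\take;

// With a plain array input and no stream() call, these stages can
// delegate to native array functions such as array_filter()/array_map().
$names = take(['alice', '', 'bob'])
    ->filter()                                   // no callback: drop falsy entries
    ->cast(fn(string $s): string => ucfirst($s)) // one-to-one transform
    ->toList();

// $names is ['Alice', 'Bob']
```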
## Other Tips

- Use the `chunk()` method to process large datasets in smaller, more manageable batches.
- Release resources, such as open file handles, in a `finally` block.

## Profiling

Before optimizing, always profile your code to identify the actual bottlenecks. Use tools like Xdebug or Blackfire to get a clear picture of your pipeline's performance.
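As a rough first check before reaching for a full profiler, you can compare peak memory with PHP's built-in `memory_get_peak_usage()`. This is only a quick sketch, not a substitute for Xdebug or Blackfire:

```php
<?php

use function Pipeline\take;

$before = memory_get_peak_usage();

// Stream through 100k numbers without building intermediate arrays.
$firstFive = take(range(1, 100_000))
    ->stream()
    ->filter(fn(int $n): bool => $n % 2 === 0)
    ->slice(0, 5)
    ->toList();

printf("Peak memory grew by %d bytes\n", memory_get_peak_usage() - $before);

// $firstFive is [2, 4, 6, 8, 10]
```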