spatie/pdf-to-text
Extract text from PDF files in PHP using Spatie’s pdf-to-text wrapper around the pdftotext binary (Poppler/Xpdf). Simple API (Pdf::getText), supports custom binary paths and options, ideal for Laravel apps needing fast PDF text extraction.
composer require spatie/pdf-to-text
Install the poppler system dependency (required!):
Ubuntu/Debian: sudo apt install poppler-utils
macOS: brew install poppler
Windows: ensure pdftotext.exe is in PATH.
use Spatie\PdfToText\Pdf;
// The constructor argument is the pdftotext binary path, so set the PDF with setPdf().
$text = (new Pdf())->setPdf('path/to/document.pdf')->text();
$text = (new Pdf())
    ->setPdf($request->file('pdf')->getRealPath()) // path of the uploaded file
    ->setOptions(['-layout']) // preserve layout
    ->text();
foreach ($pdfPaths as $path) {
    // ExtractPdfJob is your application's own queued job class.
    dispatch((new ExtractPdfJob($path))->onQueue('ocr'));
}
Handle UploadedFile instances by passing the file's path:
$pdf = $request->file('upload');
$text = (new Pdf())->setPdf($pdf->getRealPath())->text();
Set a custom pdftotext path if needed (e.g., non-standard deployment):
$text = (new Pdf('/opt/poppler/bin/pdftotext'))->setPdf($path)->text();
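When the binary location differs between environments, one option is to probe a list of candidate paths before constructing the extractor. The following is a minimal sketch: the helper name firstExecutable, the PDFTOTEXT_BINARY environment variable, and the candidate paths are all assumptions for illustration, not part of the package.

```php
<?php
declare(strict_types=1);

/**
 * Return the first candidate that is an executable regular file, or null.
 * (Hypothetical helper, not part of spatie/pdf-to-text.)
 */
function firstExecutable(array $candidates): ?string
{
    foreach ($candidates as $candidate) {
        // Skip empty or non-string entries, and directories (which can also be "executable").
        if (is_string($candidate) && $candidate !== ''
            && is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }

    return null;
}

// Example candidates; PDFTOTEXT_BINARY is an assumed environment variable name.
$binary = firstExecutable([
    getenv('PDFTOTEXT_BINARY') ?: '',
    '/opt/poppler/bin/pdftotext',
    '/usr/bin/pdftotext',
    '/usr/local/bin/pdftotext',
]);
```

If $binary is null, none of the candidates exist on this machine, so the application can fail early with a clear message. Otherwise the path can be passed to the Pdf constructor.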
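Handle exceptions, since the PDF may be corrupted or the binary missing:
try {
    $text = (new Pdf())->setPdf($path)->text();
} catch (\Spatie\PdfToText\Exceptions\BinaryNotFound $e) {
    // Handle missing poppler
} catch (\Spatie\PdfToText\Exceptions\CouldNotExtractText $e) {
    // Handle a corrupted or unreadable PDF
}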
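In a batch run, one unreadable file usually should not abort the whole loop. The sketch below wraps any extraction callable so that failures are logged and reported as null. The helper name extractOrNull is hypothetical, and the callable is assumed to wrap a call such as Pdf::getText.

```php
<?php
declare(strict_types=1);

/**
 * Run an extraction callable and return null instead of throwing.
 * $extract is any callable that takes a file path and returns the text.
 * (Hypothetical helper, not part of spatie/pdf-to-text.)
 */
function extractOrNull(callable $extract, string $path): ?string
{
    try {
        return $extract($path);
    } catch (\Throwable $e) {
        // Log the failure and carry on so one bad file does not stop a batch.
        error_log(sprintf('PDF extraction failed for %s: %s', $path, $e->getMessage()));

        return null;
    }
}
```

A caller might pass a closure that delegates to Pdf::getText and skip any path for which the result is null. Catching \Throwable is deliberately broad here; narrowing it to the package's exception classes is a reasonable alternative when other errors should still surface.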
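Process functions disabled: The package shells out to pdftotext, so ensure functions such as exec(), shell_exec(), and proc_open() are not listed in disable_functions in php.ini, and that open_basedir allows temp paths.
Layout: Use ->setOptions(['-layout']) for tables/columns, but test thoroughly, because complex PDFs may still misalign text.
Scanned PDFs: pdftotext reads only an existing text layer. For image-only PDFs, convert pages with spatie/pdf-to-image, then run tesseract for OCR.
Encoding: Apply mb_convert_encoding() post-extraction if needed.
Testing: Pass a stand-in binary to the constructor (e.g., new Pdf('/bin/true')) and override text() in tests, or stub the class.
Configuration: php artisan vendor:publish --tag="pdf-to-text-config" sets defaults globally only if your installed version registers a publishable config. The package is framework-agnostic, so verify this before relying on it.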
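The encoding tip can be sketched as a small post-processing step. This is a minimal example that assumes non-UTF-8 output should be treated as ISO-8859-1. That fallback and the helper name toUtf8 are assumptions for illustration, so adjust the source encoding to match your documents. It requires the mbstring extension.

```php
<?php
declare(strict_types=1);

/**
 * Return $text as UTF-8, converting from $fallbackEncoding only when
 * the input is not already valid UTF-8.
 * (Hypothetical helper, not part of spatie/pdf-to-text.)
 */
function toUtf8(string $text, string $fallbackEncoding = 'ISO-8859-1'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        // Already valid UTF-8: leave the text untouched.
        return $text;
    }

    return mb_convert_encoding($text, 'UTF-8', $fallbackEncoding);
}
```

Checking validity first avoids double-encoding text that is already UTF-8, which is a common source of garbled accented characters.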