Weave Code
Code Weaver
Helps Laravel developers discover, compare, and choose open-source packages. See popularity, security, maintainers, and scores at a glance to make better decisions.

Pdfparser Laravel Package

smalot/pdfparser

Standalone PHP library to parse PDF files and extract content. Reads objects/headers, metadata, and ordered page text; supports compressed PDFs and various encodings. Configure parsing via custom configs. Note: no support for secured PDFs or form data.


PDF parser library. Can read and extract information from PDF files.

Frequently asked questions about Pdfparser
How do I install smalot/pdfparser in a Laravel project?
Run `composer require smalot/pdfparser` in your project directory. The package requires PHP 7.1+ and is a standalone library, so it works in Laravel through Composer's autoloader without a service provider or any Laravel-specific dependencies.
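Once installed, a first parsing call might look like the sketch below. It assumes Composer's autoloader is reachable at `vendor/autoload.php` (inside a Laravel app the framework already loads it) and that `document.pdf` is a placeholder for a real, unencrypted file.

```php
<?php

require __DIR__ . '/vendor/autoload.php';

use Smalot\PdfParser\Parser;

// Placeholder path; replace with a real, unencrypted PDF.
$path = __DIR__ . '/document.pdf';

$parser = new Parser();
$pdf = $parser->parseFile($path);

// Full document text, in page order.
echo $pdf->getText(), PHP_EOL;

// Metadata such as Title, Author, or Pages, when the file provides it.
foreach ($pdf->getDetails() as $property => $value) {
    // Some values are arrays; flatten them for display.
    $value = is_array($value) ? implode(', ', $value) : $value;
    echo $property, ' => ', $value, PHP_EOL;
}
```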
Does smalot/pdfparser support encrypted or password-protected PDFs?
No, this package does not support encrypted or password-protected PDFs. If you need to handle secured documents, fall back to an external tool such as `pdftotext`, either by calling it yourself through `symfony/process` or by using `spatie/pdf-to-text`, which wraps the same command-line tool.
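A fallback along those lines could be sketched as follows. The helper name `extractTextWithPdftotext` is hypothetical, and the sketch assumes `symfony/process` is installed and Poppler's `pdftotext` binary is on the server's `PATH`.

```php
<?php

require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\Process\Process;

/**
 * Extract text from a password-protected PDF with the pdftotext CLI.
 * Hypothetical helper; assumes Poppler's pdftotext is on the PATH.
 */
function extractTextWithPdftotext(string $path, string $userPassword): string
{
    // "-upw" supplies the user password; "-" sends the text to stdout.
    $process = new Process(['pdftotext', '-upw', $userPassword, $path, '-']);
    $process->mustRun(); // Throws ProcessFailedException on a non-zero exit.

    return $process->getOutput();
}
```

Note that a password passed as a command-line argument can be visible to other users on the same host, so this approach suits trusted environments.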
Can I extract structured data like tables or images from PDFs with this package?
The package extracts raw text and metadata; it does not reconstruct tables or other complex layouts. For structured data, post-process the extracted text with regular expressions, NLP, or additional libraries.
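As an illustration of regex post-processing, the sketch below turns extracted text into rows. The function name `parseLineItems` and the three-column "item, quantity, price" layout are assumptions made for this example; real documents will need their own patterns.

```php
<?php

/**
 * Parse lines such as "Widget A    2    19.99" into structured rows.
 * The three-column layout is an assumption made for this example.
 *
 * @return array<int, array{item: string, qty: int, price: float}>
 */
function parseLineItems(string $text): array
{
    $rows = [];
    foreach (preg_split('/\R/', $text) as $line) {
        // Item name, then an integer quantity, then a decimal price.
        if (preg_match('/^(.+?)\s+(\d+)\s+(\d+\.\d{2})$/', trim($line), $m)) {
            $rows[] = ['item' => $m[1], 'qty' => (int) $m[2], 'price' => (float) $m[3]];
        }
    }

    return $rows;
}

// Sample text standing in for the output of $pdf->getText().
$sample = "Invoice 1001\nWidget A    2    19.99\nWidget B   10     4.50\nThank you";
$rows = parseLineItems($sample);
```

Lines that do not match the pattern, such as the heading and the closing line in the sample, are skipped.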
How do I configure memory limits for large PDF files?
Create a `Config` object and call its `setDecodeMemoryLimit()` method to cap the memory used when decoding large files. If you don't need image data, also call `setRetainImageContent(false)` to reduce memory consumption. Pass the config to the parser when you construct it. This is most useful for very large files, for example those over 100 MB.
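A minimal sketch of those two settings follows. The 32 MB limit and the `large.pdf` path are illustrative values, not recommendations from the package.

```php
<?php

require __DIR__ . '/vendor/autoload.php';

use Smalot\PdfParser\Config;
use Smalot\PdfParser\Parser;

$config = new Config();
// Cap the memory used when decoding streams, in bytes (illustrative value).
$config->setDecodeMemoryLimit(32 * 1024 * 1024);
// Skip image data when only text is needed.
$config->setRetainImageContent(false);

// The first constructor argument is left empty; the Config is the second.
$parser = new Parser([], $config);
$pdf = $parser->parseFile(__DIR__ . '/large.pdf'); // placeholder path
```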
Is smalot/pdfparser compatible with Laravel 9 or 10?
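Yes. The package requires PHP 7.1+, so it runs on the PHP versions that Laravel 9 (PHP 8.0+) and Laravel 10 (PHP 8.1+) need, and it has no dependency on the framework itself. However, always check the latest release notes for breaking changes, as the library is community-maintained.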
How can I integrate this package into Laravel’s service container?
Create a service provider to bind the parser as a singleton. For example, register it in `PdfParserServiceProvider` and bind it to the container. You can then use dependency injection in your controllers or services.
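One way such a provider could look is sketched below. The class name `PdfParserServiceProvider` comes from the answer above; the choice to disable image retention is an assumption for illustration.

```php
<?php

namespace App\Providers;

use Illuminate\Support\ServiceProvider;
use Smalot\PdfParser\Config;
use Smalot\PdfParser\Parser;

class PdfParserServiceProvider extends ServiceProvider
{
    public function register(): void
    {
        // One shared Parser instance per application container.
        $this->app->singleton(Parser::class, function () {
            $config = new Config();
            $config->setRetainImageContent(false); // assumed default for this app

            return new Parser([], $config);
        });
    }
}
```

In Laravel 9 and 10 the provider also needs to be listed in the `providers` array of `config/app.php`. After that, type-hinting `Parser` in a controller or service constructor resolves the shared instance.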
What are the alternatives to smalot/pdfparser for PDF parsing in Laravel?
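The main alternative is `spatie/pdf-to-text`, a wrapper around the `pdftotext` command-line tool. Packages such as `setasign/fpdf` and `mikehaertl/phpwkhtmltopdf` generate PDFs rather than parse them, so they are not substitutes for text extraction. If you need encrypted PDF support, use an external tool like `pdftotext`, which accepts a password.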
How do I test the accuracy of extracted text from PDFs?
Validate extraction accuracy by comparing output against known PDFs with expected text. Use unit tests to verify metadata extraction (e.g., author, creation date) and text order. Test with real-world documents, including multi-column layouts. Scanned PDFs usually contain only page images, so expect no text from them unless they carry an OCR text layer.
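A PHPUnit test in that spirit might look like the sketch below. The fixture path, the phrases `Introduction` and `Conclusion`, and the author `Jane Doe` are all hypothetical; substitute values from a PDF whose content you know.

```php
<?php

use PHPUnit\Framework\TestCase;
use Smalot\PdfParser\Parser;

class PdfExtractionTest extends TestCase
{
    public function testExtractsExpectedTextAndMetadata(): void
    {
        // Hypothetical fixture with known content, committed with the tests.
        $pdf = (new Parser())->parseFile(__DIR__ . '/fixtures/sample.pdf');

        // Check that two known phrases are present and in the right order.
        $text = $pdf->getText();
        $first = strpos($text, 'Introduction');
        $second = strpos($text, 'Conclusion');
        $this->assertNotFalse($first);
        $this->assertNotFalse($second);
        $this->assertLessThan($second, $first);

        // Metadata keys depend on what the PDF declares.
        $details = $pdf->getDetails();
        $this->assertSame('Jane Doe', $details['Author'] ?? null);
    }
}
```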
Can I use this package in Laravel background jobs or queues?
Yes, the package is stateless and can be used in Laravel queues for long-running tasks. Dispatch a job like `ParsePdfJob::dispatch($filePath)->onQueue('pdf-parsing')` to offload parsing from HTTP requests.
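The `ParsePdfJob` named above is not part of the package; a minimal version could be sketched like this, assuming Laravel 9 or 10 and that logging the text length stands in for whatever the application does with the result.

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Smalot\PdfParser\Parser;

class ParsePdfJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(private string $filePath)
    {
    }

    // Laravel resolves the Parser from the container when the job runs.
    public function handle(Parser $parser): void
    {
        $text = $parser->parseFile($this->filePath)->getText();

        // What happens to the text is application-specific; logged here.
        logger()->info('Parsed PDF', ['path' => $this->filePath, 'length' => strlen($text)]);
    }
}
```

Passing a file path rather than file contents keeps the queued payload small, but the path must be readable from the worker process.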
Does smalot/pdfparser support custom configurations for parsing?
Yes, you can create custom configurations using the `Config` class. This allows you to adjust settings like encoding, memory limits, or text extraction behavior. Refer to the [CustomConfig.md](https://github.com/smalot/pdfparser/blob/master/doc/CustomConfig.md) documentation for details.
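As a sketch of text-extraction tuning, the example below adjusts two settings described in that document. The specific values and the `report.pdf` path are illustrative; suitable values depend on the PDFs being parsed.

```php
<?php

require __DIR__ . '/vendor/autoload.php';

use Smalot\PdfParser\Config;
use Smalot\PdfParser\Parser;

$config = new Config();
// Adjust when getText() returns text with too many spaces (illustrative value).
$config->setFontSpaceLimit(-60);
// Adjust when words come out broken up; an empty string is one option to try.
$config->setHorizontalOffset('');

$parser = new Parser([], $config);
$text = $parser->parseFile(__DIR__ . '/report.pdf')->getText(); // placeholder path
```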