Installation:
composer require helgesverre/receipt-scanner
php artisan vendor:publish --tag="receipt-scanner-config"
php artisan vendor:publish --provider="OpenAI\Laravel\OpenAIServiceProvider"
Add OPENAI_API_KEY to .env.
First Use Case: Extract data from a receipt image:
use HelgeSverre\ReceiptScanner\Facades\ReceiptScanner;
$result = ReceiptScanner::scanFromImage(public_path('receipt.png'));
dd($result); // Returns structured array of receipt data
config/receipt-scanner.php (adjust OpenAI model, temperature, and prompt settings).HelgeSverre\ReceiptScanner\Facades\ReceiptScanner (primary entry point).scanFromText(string $text)scanFromImage(string $path)scanFromPdf(string $path)scanFromHtml(string $html)Receipt Processing Pipeline:
// Handle file uploads (e.g., from a form)
$file = request()->file('receipt');
$path = $file->store('receipts');
// Scan based on file type
$result = match ($file->getClientOriginalExtension()) {
'png', 'jpg', 'jpeg' => ReceiptScanner::scanFromImage($path),
'pdf' => ReceiptScanner::scanFromPdf($path),
default => ReceiptScanner::scanFromText(file_get_contents($path)),
};
Email Attachment Parsing:
$email = Mail::getMessage(); // From Laravel Notifications or IncomingMail
foreach ($email->attachments as $attachment) {
$data = ReceiptScanner::scanFromImage(storage_path("app/{$attachment->getFilename()}"));
// Store or process $data
}
Batch Processing:
$files = Storage::disk('s3')->files('receipts/inbox');
foreach ($files as $file) {
$data = ReceiptScanner::scanFromPdf(storage_path("app/{$file}"));
// Save to database or queue for async processing
}
Database Storage:
Use Laravel’s HasMany/MorphTo to link receipt data to models (e.g., User, Invoice):
// Migration
Schema::create('receipts', function (Blueprint $table) {
$table->id();
$table->json('data'); // Structured output from scanner
$table->string('source_path');
$table->morphs('owner');
$table->timestamps();
});
Queue Jobs:
Offload scanning to a queue (e.g., scanReceiptJob) to avoid timeouts:
ScanReceiptJob::dispatch($filePath)->onQueue('receipts');
Validation: Validate OpenAI responses with a custom validator:
use HelgeSverre\ReceiptScanner\Rules\ValidReceiptData;
$request->validate([
'receipt' => ['required', new ValidReceiptData],
]);
Fallback Logic: Handle OCR/Textract failures gracefully:
try {
$data = ReceiptScanner::scanFromImage($path);
} catch (\Exception $e) {
Log::error("OCR failed: {$e->getMessage()}");
$data = ReceiptScanner::scanFromText(file_get_contents($path)); // Fallback
}
OpenAI API Limits:
OpenAI::getRateLimit() to avoid hitting quotas.$cacheKey = md5_file($path);
$data = Cache::remember($cacheKey, now()->addHours(1), function () use ($path) {
return ReceiptScanner::scanFromImage($path);
});
File Size Limits:
use HelgeSverre\ReceiptScanner\Services\Preprocessor;
$text = Preprocessor::extractTextFromPdf($path, maxPages: 5);
$data = ReceiptScanner::scanFromText($text);
Prompt Sensitivity:
config/receipt-scanner.php if the package’s prompt misses fields (e.g., tax IDs). Test with edge cases (e.g., handwritten receipts).OCR Dependencies:
AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY are in .env.Structured Data Schema:
class Receipt extends Model {
protected $casts = [
'data' => ReceiptData::class, // Custom cast for validation/access
];
}
Log Raw Responses:
Enable debug mode in config/receipt-scanner.php to log OpenAI responses:
'debug' => env('RECEIPT_SCANNER_DEBUG', false),
Check logs at storage/logs/laravel.log.
Validate Inputs: Sanitize inputs before scanning to avoid malformed data:
$text = preg_replace('/[^a-zA-Z0-9\s.,-]/', '', $rawText);
$data = ReceiptScanner::scanFromText($text);
Test with Known Data: Use a sample receipt (e.g., from this repo’s tests) to verify accuracy.
Custom Extractors: Extend the base scanner by creating a custom extractor:
namespace App\Services;
use HelgeSverre\ReceiptScanner\Contracts\Extractor;
class CustomExtractor implements Extractor {
public function extract($input): array {
// Custom logic (e.g., regex + OpenAI hybrid)
return $this->hybridParse($input);
}
}
Register it in config/receipt-scanner.php:
'extractors' => [
'custom' => App\Services\CustomExtractor::class,
],
Webhook Integration: Trigger scans via a webhook (e.g., from a cloud storage event):
Route::post('/scan-webhook', function (Request $request) {
$fileUrl = $request->input('file_url');
$data = ReceiptScanner::scanFromImage(storage_path("temp/{$request->filename}"));
// Process and return $data
});
Model Observers: Automatically scan uploads on model events:
class ReceiptObserver {
public function saved(Receipt $receipt) {
if ($receipt->wasRecentlyCreated && $receipt->source_path) {
$data = ReceiptScanner::scanFromImage($receipt->source_path);
$receipt->update(['data' => $data]);
}
}
}
Prompt Tuning: Override the default prompt for domain-specific receipts (e.g., medical invoices):
ReceiptScanner::setPrompt(file_get_contents(app_path('Prompts/medical-receipt-prompt.txt')));
How can I help you explore Laravel packages today?