The ingestion pipeline converts files, text, and URLs into searchable vectors.
use LarAIgent\AiKit\Services\Ingestion\IngestionService;
$ingestion = app(IngestionService::class);
// Upload from request
$asset = $ingestion->ingestFile($request->file('document'), scope: ['chatbot_id' => 42]);
// Re-ingest file already stored on a Laravel disk
$asset = $ingestion->ingestFromDisk('public', 'knowledge/report.pdf', scope: ['chatbot_id' => 42]);
// Raw text
$asset = $ingestion->ingestText('Company policy text...', name: 'Policy', scope: ['chatbot_id' => 42]);
// URL (safety checks run immediately; fetch/parse/chunk runs in jobs)
$asset = $ingestion->ingestUrl('https://example.com/docs/faq', scope: ['chatbot_id' => 42]);
All entry points now use queued jobs for heavy work:
queued -> parsing -> chunking -> embedding -> indexed
\-> failed
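
The state flow above can be sketched as a small state machine. This is a minimal illustration, not the package's own type: the `IngestionState` enum and its methods are hypothetical names, and it assumes any non-terminal stage may move to `failed`.

```php
<?php

declare(strict_types=1);

// Hypothetical enum mirroring the documented states; not part of the package API.
enum IngestionState: string
{
    case Queued = 'queued';
    case Parsing = 'parsing';
    case Chunking = 'chunking';
    case Embedding = 'embedding';
    case Indexed = 'indexed';
    case Failed = 'failed';

    /**
     * States reachable from this one.
     * Assumption: the happy path is strictly linear and every
     * non-terminal stage can end in "failed".
     *
     * @return list<self>
     */
    public function next(): array
    {
        return match ($this) {
            self::Queued => [self::Parsing, self::Failed],
            self::Parsing => [self::Chunking, self::Failed],
            self::Chunking => [self::Embedding, self::Failed],
            self::Embedding => [self::Indexed, self::Failed],
            self::Indexed, self::Failed => [],
        };
    }

    public function canTransitionTo(self $target): bool
    {
        return in_array($target, $this->next(), true);
    }

    // Terminal states have no outgoing transitions.
    public function isTerminal(): bool
    {
        return $this->next() === [];
    }
}
```

A check like `IngestionState::from($row->state)->isTerminal()` (where `$row->state` is an assumed column name) could tell a status poller when to stop polling.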
Jobs involved:

- ParseAssetJob for uploaded/disk files
- ProcessTextAssetJob for raw text ingestion
- FetchUrlAssetJob for URL fetch + content extraction
- ChunkAssetJob for chunk creation
- EmbedChunksJob for embedding + vector upsert

| State | Meaning |
|---|---|
| queued | Accepted and waiting for queued processing |
| parsing | Parsing/extraction stage (including URL fetch) |
| chunking | Chunk creation in progress |
| embedding | Embeddings + vector upsert in progress |
| indexed | Successfully indexed and searchable |
| failed | Terminal failure (error column contains details) |

| Event | Fires When | Timing |
|---|---|---|
| IngestionStateChanged | Every state transition | Immediate |
| AssetIndexed | Terminal success | After commit when inside transaction |
| AssetFailed | Terminal failure | After commit when inside transaction |

Use terminal events for business workflows that depend on committed domain writes:
use Illuminate\Support\Facades\Event;
use LarAIgent\AiKit\Events\AssetIndexed;
Event::listen(AssetIndexed::class, function ($event) {
    KnowledgeBase::where('ai_asset_id', $event->asset->id)->update([
        'status' => 'indexed',
        'chunk_count' => $event->ingestion->chunk_count,
        'indexed_at' => now(),
    ]);
});
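
The "after commit" timing of the terminal events can be illustrated with a framework-free sketch. The `AfterCommitDispatcher` class below is hypothetical and is neither Laravel's dispatcher nor the package's implementation. It only shows the general idea: events flagged as after-commit are held back while a transaction is open, delivered on commit, and (an assumption here) dropped on rollback.

```php
<?php

declare(strict_types=1);

// Hypothetical illustration of after-commit dispatch; not a real Laravel or package class.
final class AfterCommitDispatcher
{
    /** @var list<string> events held back while a transaction is open */
    private array $pending = [];

    /** @var list<string> events that have reached listeners */
    public array $delivered = [];

    private bool $inTransaction = false;

    public function begin(): void
    {
        $this->inTransaction = true;
    }

    public function dispatch(string $event, bool $afterCommit): void
    {
        // Buffer only when the event asks for it AND a transaction is open.
        if ($afterCommit && $this->inTransaction) {
            $this->pending[] = $event;
            return;
        }
        $this->delivered[] = $event;
    }

    public function commit(): void
    {
        $this->inTransaction = false;
        $this->delivered = array_merge($this->delivered, $this->pending);
        $this->pending = [];
    }

    public function rollBack(): void
    {
        // Assumption: buffered events are discarded when the transaction fails.
        $this->inTransaction = false;
        $this->pending = [];
    }
}

$dispatcher = new AfterCommitDispatcher();
$dispatcher->begin();
$dispatcher->dispatch('IngestionStateChanged', afterCommit: false); // delivered at once
$dispatcher->dispatch('AssetIndexed', afterCommit: true);           // held until commit
$dispatcher->commit();
```

Under this model a listener for a terminal event only runs once the writes it depends on are visible, which is why terminal events suit business workflows better than `IngestionStateChanged`.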

- QUEUE_CONNECTION=sync still executes inline (Laravel default behavior).
- For asynchronous processing, use database, redis, etc. and run workers.

QUEUE_CONNECTION=database
php artisan queue:work

| Type | MIME | Parser |
|---|---|---|
| Text | text/plain, text/markdown, text/csv | Built-in |
| HTML | text/html, application/xhtml+xml | Built-in |
| PDF | application/pdf | smalot/pdfparser |
| DOCX | application/vnd.openxmlformats-officedocument.wordprocessingml.document | phpoffice/phpword |
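
The MIME table above can be read as a simple lookup. The sketch below is an assumption about how such a mapping might look, not the package's resolver: `PARSER_BY_MIME` and `parserFor()` are hypothetical names, and returning `null` for unsupported types is a choice made for this example.

```php
<?php

declare(strict_types=1);

// Hypothetical map built from the supported-types table; not the package's own code.
const PARSER_BY_MIME = [
    'text/plain' => 'Built-in',
    'text/markdown' => 'Built-in',
    'text/csv' => 'Built-in',
    'text/html' => 'Built-in',
    'application/xhtml+xml' => 'Built-in',
    'application/pdf' => 'smalot/pdfparser',
    'application/vnd.openxmlformats-officedocument.wordprocessingml.document' => 'phpoffice/phpword',
];

/**
 * Return the parser name for a MIME type, or null when the type is not listed.
 */
function parserFor(string $mime): ?string
{
    // Drop parameters such as "; charset=utf-8" and normalise case before the lookup.
    $base = strtolower(trim(explode(';', $mime, 2)[0]));

    return PARSER_BY_MIME[$base] ?? null;
}
```

A pre-upload check could call `parserFor($file->getMimeType())` (the method comes from Laravel's `UploadedFile`) and reject the file early when the result is `null`.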