- IngestionService::ingestFromDisk() - new API to ingest files that already exist on any configured Laravel disk (disk + path + scope) without requiring an UploadedFile. Reuses parser validation and the queued file pipeline. (#11)
- ChatStreamResponse::withOnComplete(callable $cb): static - lets callers chain post-stream persistence logic without rebuilding SSE loops. Callback receives (fullText, usage, sources). Existing internal callback behavior is preserved and executed first. (#10)
- ProcessTextAssetJob and FetchUrlAssetJob so ingestText() and ingestUrl() now use queued pipelines consistent with ingestFile(). (#13)
- docs/redundancy-audit-v0.2.4.md for non-blocking cleanup/debt discovered during this release cycle.
- Asset captures related chunk IDs and dispatches DeleteAssetVectorsJob after commit to prevent orphaned vectors. (#9)
- PgVectorStore::delete() now removes vectors by both direct document ID fallback and source_meta.chunk_id mapping, preventing stale vectors when IDs do not align one-to-one. (#9 overlap)
- AssetIndexed and AssetFailed are now deferred at the DB transaction boundary (after commit) instead of response-bound behavior. Removes race conditions where listeners executed before caller-linked ai_asset_id writes were visible. (#12)
- AssetIndexed and AssetFailed events fired after Ingestion::markState() reaches a terminal state. Unlike IngestionStateChanged they are dispatched via dispatch()->afterResponse() in a web request, so listeners run after the caller has committed its outer transaction and linked its domain rows to the asset. In CLI/queue-worker contexts they fire immediately (no response boundary to defer against). Eliminates the race condition where consumers had to fall back to matching assets by source_name because listeners ran before ai_asset_id was stored. IngestionStateChanged behavior is unchanged (fully backward compatible).
  (#5)
- ChatCompleted: inputTokens and outputTokens are now populated from the SDK's AgentResponse::$usage (non-streaming) and StreamableAgentResponse::$usage (streaming, available after the stream fully drains). Extraction uses duck typing against usage->promptTokens / usage->completionTokens, so test doubles and future SDK shape changes don't break the pipeline. durationMs is now also populated on the streaming path. (#6)
- EmbeddingProvider::embedManyWithUsage() - new optional method on the OpenAiEmbedding implementation that returns an EmbeddingResult (vectors + provider-reported token count). EmbedChunksJob detects it via method_exists() and emits real tokenCount on EmbeddingsCompleted when available; third-party providers that implement only the original contract continue to work unchanged (tokens reported as 0). The contract itself is untouched (fully backward compatible). (#6)
- LARAI_HEALTH_MIDDLEWARE env var accepts a pipe-separated list of middleware aliases (e.g. auth|throttle:60,1). Pipe is used instead of comma so Laravel's rate-limit parameter syntax (throttle:60,1 = 60 requests per minute) survives intact. Defaults to auth to preserve existing behavior. Set to empty to expose the endpoint publicly (for use behind a private network or ingress allowlist). Previously the middleware stack was hardcoded to ['auth'], blocking monitoring systems that authenticate via API key or IP allowlist. (#7)
- phpunit/phpunit and orchestra/testbench as dev dependencies, phpunit.xml.dist, and tests/TestCase.php base. First unit tests cover the new EmbeddingResult DTO and the ChatStreamResponse usage-passing contract. Run with composer test.
- OpenAiEmbedding::embedMany() was not actually batching: despite the method name, the implementation looped $this->embed($text) per chunk, issuing one HTTP request per input. It now delegates to embedManyWithUsage() which uses Embeddings::for($texts)->generate() with a batch size of 96.
  For a 200-chunk document this drops embedding API calls from 200 to 3. (#6)
- DoctorCommand::warn() visibility conflict - the warn() method introduced in v0.2.1 was declared private, but Illuminate\Console\Command defines a public warn(). PHP refuses to load the class, crashing the app on boot. Renamed to printWarn() (same pattern as the existing printFail()). Apps on v0.2.1 were unable to boot; upgrade to v0.2.2 immediately. (#4)
- ChatStreamResponse::getIterator() fell back to (string) $event for non-text stream events, which serialized StreamEnd, tool events, etc. as their JSON representation, breaking SSE consumers. Replaced with an extractDelta() helper that uses duck typing on a delta property: only TextDelta events emit text; metadata events are silently skipped. (#4)
- ingestFile(), ingestText(), and ingestUrl() now call $asset->load('ingestion') before returning, so callers always get the final pipeline state without needing $asset->fresh(). (#2)
- scope column - v0.2.0 modified the original assets migration in-place, breaking users upgrading from v0.1.x. Added 2026_04_14_000008_add_scope_to_ai_assets_table.php with hasColumn guard. (#2)
- larai:doctor and the health endpoint now show "[WARN] Pdf parser — install smalot/pdfparser to enable" and "[WARN] Docx parser — install phpoffice/phpword to enable" instead of letting users discover this via runtime crash. (#2)
- toArray() called on already-plain array; now handles both return types defensively
- set_time_limit(0) and $timeout = 300; batch embedding via embedMany() reduces API calls 5x
- LARAI_RETRY_MAX and LARAI_RETRY_DELAY_MS
- markState('indexed') now fails if chunk_count === 0
- scope parameter on ingestText(), ingestFile(), and retrieve(). Pinecone uses metadata $eq filters; pgvector uses JSON WHERE clauses. Prevents data leaks across customers.
- EmbeddingProvider::embedMany(array $texts): array contract method.
  EmbedChunksJob now embeds all chunks in one pass instead of per-chunk loops.
- VectorStore::upsertMany(array $items): void contract method. Pinecone batches at 100 vectors/request.
- VectorStore::search() accepts array $scope parameter for tenant-isolated queries.
- sendMessage() now accepts custom Agent, history, scope, topK, and threshold parameters. Defaults to SupportAgent for backward compatibility.
- IngestionStateChanged event fired on every state transition for observability.
- larai:doctor --deep - live embedding probe that tests the actual API call and verifies vector dimensions match config.
- larai-kit.retry.max_attempts, retry.base_delay_ms, retry.on_status in config.
- scope column - added to ai_assets migration for multi-tenant filtering.

Core Architecture
- AiKitServiceProvider with auto-discovery, config merging, and smart contract bindings
- FeatureDetector with tier-based graceful degradation (Tier 0-3)
- config/larai-kit.php with all tunables via .env

Multi-Provider Support
- LARAI_AI_PROVIDER)
- LARAI_VECTOR_STORE)

Contracts
- EmbeddingProvider: embed(), embedMany(), dimensions()
- VectorStore: upsert(), upsertMany(), search(), delete()
- FileStorage, DocumentParser, ChatProvider

Vector Store Implementations
- PineconeVectorStore: HTTP-based with retry/backoff, zero extra packages
- PgVectorStore: Eloquent-based with whereVectorSimilarTo()
- NullVectorStore: no-op fallback for graceful degradation

Ingestion Pipeline
- IngestionService orchestrator with multi-tenant scope support
- ParseAssetJob, ChunkAssetJob, EmbedChunksJob, DeleteAssetVectorsJob
- Chunker with configurable size and overlap
- TextParser, PdfParser, DocxParser

RAG & Chat
- RetrievalService for scoped semantic search via the VectorStore contract
- ChatService with RAG context injection, source citations, and custom agent support

Models
- Document, Asset, Chunk, Ingestion with relationships, casts, and events

Artisan Commands
- larai:install, larai:doctor (with --deep), larai:chat, make:larai-agent, make:larai-tool
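
The embedMany() fix listed earlier states that a 200-chunk document goes from 200 embedding calls to 3 at a batch size of 96. That arithmetic can be illustrated with a minimal, framework-free sketch. It is not the package's implementation: embedInBatches() and the $embedBatch callback are hypothetical names, and the fake provider stands in for a real embeddings request.

```php
<?php

declare(strict_types=1);

/**
 * Embed texts in fixed-size batches instead of one request per text.
 *
 * $embedBatch is a hypothetical stand-in for a provider call: it receives a
 * list of strings and returns one vector per string, in the same order.
 *
 * @param list<string> $texts
 * @param callable(list<string>): list<list<float>> $embedBatch
 * @return array{vectors: list<list<float>>, calls: int}
 */
function embedInBatches(array $texts, callable $embedBatch, int $batchSize = 96): array
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $vectors = [];
    $calls = 0;

    // array_chunk() splits the input into consecutive slices of at most $batchSize items.
    foreach (array_chunk($texts, $batchSize) as $batch) {
        $calls++; // one simulated provider request per batch
        foreach ($embedBatch($batch) as $vector) {
            $vectors[] = $vector;
        }
    }

    return ['vectors' => $vectors, 'calls' => $calls];
}

// Fake provider (assumption for illustration): a one-dimensional "vector"
// holding the length of each text.
$fakeProvider = static fn (array $batch): array => array_map(
    static fn (string $text): array => [(float) strlen($text)],
    $batch
);

$texts = array_map(static fn (int $i): string => "chunk {$i}", range(1, 200));
$result = embedInBatches($texts, $fakeProvider);

echo $result['calls'], PHP_EOL; // 3 (batches of 96, 96, and 8)
```

The per-chunk loop described in the fix would correspond to a batch size of 1 here, which yields 200 calls for the same input.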