Ai Click House Store Laravel Package

symfony/ai-click-house-store

ClickHouse vector store integration for Symfony AI Store. Store and query embeddings in ClickHouse using distance functions and ANN/vector indexes for fast similarity search. The package README links to the ClickHouse documentation and to the Symfony AI contributing guide and issue tracker.

Getting Started

Minimal Setup

  1. Install the Package:

    composer require symfony/ai-click-house-store
    
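  2. Configure ClickHouse Connection: Add the entry below to config/services.php and read it with config('services.clickhouse.dsn'). Alternatively, create config/clickhouse.php that returns only the inner array, which is then read with config('clickhouse.dsn'):

    'clickhouse' => [
        'dsn' => 'http://clickhouse:8123/ai_vectors?database=default&index=1',
        // OR for native driver:
        // 'dsn' => 'clickhouse://user:password@clickhouse:9000/ai_vectors',
    ],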
    
  3. Define a ClickHouse Table: Run this SQL in ClickHouse, setting the dimension to the output size of your embedding model (768 in this example):

    CREATE TABLE vectors (
        id UInt64,
        embedding Array(Float32),  -- Array has no fixed length; the CHECK below enforces 768
        metadata String,
        CONSTRAINT embedding_length CHECK length(embedding) = 768,
        -- Vector index syntax depends on the ClickHouse version; this form follows recent releases
        INDEX ann_index embedding TYPE vector_similarity('hnsw', 'L2Distance', 768)
    ) ENGINE = MergeTree() ORDER BY id;
    
  4. First Usage:

    use Symfony\Component\AI\Store\ClickHouseStore;
    use Symfony\Component\AI\Store\Query;
    
    $store = new ClickHouseStore('http://clickhouse:8123/ai_vectors');
    $vector = array_fill(0, 768, 0.1); // Placeholder for a real 768-dim embedding
    $store->add(1, $vector, ['source' => 'user_upload']); // Numeric id, matching the UInt64 column
    
    // Query with filtering
    $query = new Query($vector, 5); // Find 5 nearest neighbors
    $query->filter(['source' => 'user_upload']);
    $results = $store->find($query);
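
Because the table and its index are built for one fixed dimension, a guard in front of add() can catch a mismatched vector before it reaches ClickHouse. The following is a minimal sketch in plain PHP. assertDimension is a hypothetical helper, not part of the package, and 768 is only the dimension used in this example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of symfony/ai-click-house-store).
 * Returns the vector as a list of floats, or throws if its length
 * or its components are not what the table expects.
 *
 * @param array<int|string, mixed> $vector
 * @return list<float>
 */
function assertDimension(array $vector, int $expected): array
{
    $actual = count($vector);
    if ($actual !== $expected) {
        throw new InvalidArgumentException(
            sprintf('Vector size mismatch (expected %d, got %d).', $expected, $actual)
        );
    }

    $floats = [];
    foreach ($vector as $key => $value) {
        if (!is_int($value) && !is_float($value)) {
            throw new InvalidArgumentException(
                sprintf('Component %s is not numeric.', (string) $key)
            );
        }
        $floats[] = (float) $value;
    }

    return $floats;
}
```

With the add() call shown in step 4, this could be used as $store->add(1, assertDimension($vector, 768), ['source' => 'user_upload']);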
    

First Use Case: Semantic Search

// In a Laravel controller or service
public function searchDocuments(string $queryText, int $limit = 3) {
    // 1. Generate embedding (e.g., using Symfony AI's embedder)
    $embedding = $this->embeddingService->embed($queryText);

    // 2. Query ClickHouse store
    $query = new Query($embedding, $limit);
    $query->filter(['type' => 'document']); // Optional metadata filter
    $results = $this->store->find($query);

    // 3. Return formatted results
    return array_map(fn ($item) => [
        'id' => $item->id,
        'distance' => $item->distance,
        'metadata' => $item->metadata,
    ], $results);
}
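
The distance value in each result is easier to interpret with the underlying formulas in view: a smaller distance means a closer match. The sketch below reimplements the two metrics this page mentions (L2 and cosine distance) in plain PHP, purely for illustration. ClickHouse computes them server-side with its own L2Distance and cosineDistance functions, and the function names here are hypothetical.

```php
<?php
declare(strict_types=1);

/** Re-indexes both vectors and checks that they have the same length. */
function alignVectors(array $a, array $b): array
{
    $a = array_values($a);
    $b = array_values($b);
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same length.');
    }
    return [$a, $b];
}

/** Euclidean (L2) distance: square root of the summed squared differences. */
function l2Distance(array $a, array $b): float
{
    [$a, $b] = alignVectors($a, $b);
    $sum = 0.0;
    foreach ($a as $i => $x) {
        $diff = $x - $b[$i];
        $sum += $diff * $diff;
    }
    return sqrt($sum);
}

/** Cosine distance: 1 minus the cosine of the angle between the vectors. */
function cosineDistance(array $a, array $b): float
{
    [$a, $b] = alignVectors($a, $b);
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $x) {
        $y = $b[$i];
        $dot += $x * $y;
        $normA += $x * $x;
        $normB += $y * $y;
    }
    if ($normA === 0.0 || $normB === 0.0) {
        throw new InvalidArgumentException('Cosine distance is undefined for a zero vector.');
    }
    return 1.0 - $dot / (sqrt($normA) * sqrt($normB));
}
```

L2 distance is sensitive to vector magnitude, while cosine distance only compares direction, so the two can rank the same documents differently unless the embeddings are normalized.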

Implementation Patterns

Workflow: Batch Ingestion

// Laravel Artisan Command for bulk import
use Illuminate\Console\Command;
use Symfony\Component\AI\Store\ClickHouseStore;

class ImportVectorsCommand extends Command {
    protected $signature = 'ai:import-vectors {file}';
    protected $description = 'Import vectors from JSON to ClickHouse';

    public function handle(): int {
        $path = $this->argument('file');
        if (!is_readable($path)) {
            $this->error("Cannot read {$path}");
            return self::FAILURE;
        }

        $store = new ClickHouseStore('http://clickhouse:8123/ai_vectors');
        // Throws JsonException on malformed input instead of silently returning null
        $data = json_decode(file_get_contents($path), true, 512, JSON_THROW_ON_ERROR);

        foreach ($data as $item) {
            $store->add($item['id'], $item['embedding'], $item['metadata'] ?? []);
        }

        $this->info('Import completed!');
        return self::SUCCESS;
    }
}
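
The command above calls add() once per item. For large files it can help to walk the data in fixed-size groups, for example to report progress or to hand each group to a bulk insert. A minimal sketch in plain PHP follows; inChunks is a hypothetical helper and the chunk size is an arbitrary choice.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: yields consecutive groups of at most $size items
 * from any iterable, without loading more than one group at a time.
 *
 * @return Generator<int, list<mixed>>
 */
function inChunks(iterable $items, int $size): Generator
{
    if ($size < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1.');
    }

    $chunk = [];
    foreach ($items as $item) {
        $chunk[] = $item;
        if (count($chunk) === $size) {
            yield $chunk;
            $chunk = [];
        }
    }

    // Emit the final, possibly shorter, group
    if ($chunk !== []) {
        yield $chunk;
    }
}
```

Inside handle(), the foreach could then iterate over inChunks($data, 500) and process each group in turn.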

Workflow: Hybrid Search (Vector + Metadata)

// Combine vector similarity with SQL filtering
public function hybridSearch(array $vector, array $filters, int $limit = 5) {
    $query = new Query($vector, $limit);
    $query->filter($filters); // e.g., ['category' => 'tech', 'date' => '>2023-01-01']

    $results = $this->store->find($query);

    return $results;
}
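
The filter comment above mixes an equality match with a comparison written as an operator prefix ('>2023-01-01'). Whether the store accepts that syntax is not confirmed here, so the sketch below only spells out what such a convention would mean, as a client-side check over a result's metadata. matchesFilters is a hypothetical helper in plain PHP.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns true if $metadata satisfies every filter.
 * A filter value starting with >, >=, < or <= is treated as a comparison;
 * anything else must match exactly.
 */
function matchesFilters(array $metadata, array $filters): bool
{
    foreach ($filters as $field => $expected) {
        if (!array_key_exists($field, $metadata)) {
            return false;
        }
        $actual = $metadata[$field];

        if (is_string($expected) && preg_match('/^(>=|<=|>|<)(.+)$/', $expected, $m) === 1) {
            // ISO dates compare correctly as strings; numeric strings compare as numbers
            $cmp = $actual <=> $m[2];
            $ok = match ($m[1]) {
                '>' => $cmp > 0,
                '>=' => $cmp >= 0,
                '<' => $cmp < 0,
                '<=' => $cmp <= 0,
            };
            if (!$ok) {
                return false;
            }
            continue;
        }

        if ($actual !== $expected) {
            return false;
        }
    }

    return true;
}
```

Filtering in PHP after the query returns fewer than $limit rows whenever some neighbors are rejected, which is why pushing the filter into the ClickHouse query is preferable when the store supports it.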

Integration with Laravel Ecosystem

  1. Service Provider Binding:

    // app/Providers/AppServiceProvider.php
    public function register() {
        $this->app->bind(\Symfony\Component\AI\Store\StoreInterface::class, function ($app) {
            return new \Symfony\Component\AI\Store\ClickHouseStore(
                config('clickhouse.dsn'),
                config('clickhouse.options', [])
            );
        });
    }
    
  2. Event-Driven Updates:

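    // Listen to model events and update vectors
    use Illuminate\Database\Eloquent\Model;
    use Symfony\Component\AI\Store\StoreInterface;
    
    // Register on a concrete model (Document is a placeholder), e.g. in a service provider's boot().
    // Model itself is abstract and cannot be observed directly.
    Document::observe(VectorObserver::class);
    
    class VectorObserver {
        // Laravel resolves observers through the container, so the store is injected
        public function __construct(private StoreInterface $store) {}
    
        public function saved(Model $model) {
            if ($model->isDirty('content')) {
                // generateEmbedding() is a placeholder; implement it with your embedding service
                $embedding = $this->generateEmbedding($model->content);
                $this->store->add($model->id, $embedding, $model->metadata());
            }
        }
    }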
    
  3. Caching Layer:

    // Cache results for frequent queries
    public function findCached(Query $query, int $ttl = 3600) {
        $cacheKey = 'ai:query:' . md5(serialize($query));
        return cache()->remember($cacheKey, $ttl, function () use ($query) {
            return $this->store->find($query);
        });
    }
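
A key built with md5(serialize($query)) depends on the internal state of the Query object, so two logically identical queries can produce different keys. One alternative is to derive the key from the query inputs directly. The sketch below does that in plain PHP; queryCacheKey is a hypothetical helper, and the ai:query: prefix is taken from the example above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds a deterministic cache key from the
 * query inputs, independent of the order in which filters were given.
 */
function queryCacheKey(array $vector, array $filters, int $limit): string
{
    // Sort filters by name so ['a' => 1, 'b' => 2] and ['b' => 2, 'a' => 1] match
    ksort($filters);

    $payload = json_encode([
        'vector' => array_values($vector),
        'filters' => $filters,
        'limit' => $limit,
    ], JSON_THROW_ON_ERROR);

    return 'ai:query:' . hash('sha256', $payload);
}
```

Exact-match caching only pays off when the same embedding is queried repeatedly, for example for a fixed set of popular search phrases.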
    

Gotchas and Tips

Pitfalls

  1. Vector Dimensionality Mismatch:

    • Error: ClickHouseException: Vector size mismatch (expected 768, got 384).
    • Fix: Ensure every inserted vector has the dimensionality the index was built for. Array(Float32) has no fixed length in ClickHouse, so enforce the size with a table constraint such as CONSTRAINT embedding_length CHECK length(embedding) = 768.
  2. ANN Index Not Used:

    • Symptom: Slow queries despite having an ANN index.
    • Debug: Check if the index is properly configured:
      SELECT name, type FROM system.data_skipping_indices WHERE table = 'vectors' AND name = 'ann_index';
      
    • Fix: Recreate the index and build it for existing data (the index syntax depends on the ClickHouse version):
      ALTER TABLE vectors DROP INDEX ann_index;
      ALTER TABLE vectors ADD INDEX ann_index embedding TYPE vector_similarity('hnsw', 'L2Distance', 768);
      ALTER TABLE vectors MATERIALIZE INDEX ann_index;
      
  3. Metadata Filtering Issues:

    • Error: ClickHouseException: Unknown column 'metadata.category' in 'where clause'.
    • Fix: Ensure metadata is stored in a column type that supports path access, such as JSON. Example schema:
      CREATE TABLE vectors (
          id UInt64,
          embedding Array(Float32),
          metadata JSON,  -- Store metadata as JSON for flexible querying
          INDEX ann_index embedding TYPE vector_similarity('hnsw', 'L2Distance', 768)
      ) ENGINE = MergeTree() ORDER BY id;
      
    • Query with:
      $query->filter(['metadata.category' => 'tech']);
      
  4. Connection Timeouts:

    • Symptom: Connection refused or Network timeout.
    • Fix: Configure retries in the ClickHouse client:
      $store = new ClickHouseStore('http://clickhouse:8123/ai_vectors', [
          'timeout' => 10.0,
          'retry' => 3,
      ]);
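
If the client options shown above are not available in your setup, transient connection failures can also be handled by wrapping the call. The following is a minimal sketch of retrying with exponential backoff in plain PHP. withRetries is a hypothetical helper, and in practice the catch should be narrowed to the connection exceptions your client actually throws.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: runs $operation, retrying on failure with
 * exponentially growing pauses (base, 2x base, 4x base, ...).
 * Rethrows the last error once $maxAttempts is reached.
 */
function withRetries(callable $operation, int $maxAttempts = 3, int $baseDelayMs = 100): mixed
{
    if ($maxAttempts < 1) {
        throw new InvalidArgumentException('maxAttempts must be at least 1.');
    }

    $attempt = 0;
    while (true) {
        $attempt++;
        try {
            return $operation();
        } catch (Throwable $e) {
            if ($attempt >= $maxAttempts) {
                throw $e;
            }
            // usleep() takes microseconds
            usleep($baseDelayMs * 1000 * (2 ** ($attempt - 1)));
        }
    }
}
```

A call would look like withRetries(fn () => $store->find($query)). Laravel also ships a global retry() helper that covers the same need.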
      

Debugging Tips

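  1. Enable ClickHouse Query Logging: Add to config/clickhouse.php. Closures in config files cannot be serialized by php artisan config:cache, so if you cache configuration, pass the logger from a service provider instead: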

    'options' => [
        'logger' => function ($query, $params) {
            \Log::debug("ClickHouse Query: {$query}", $params);
        },
    ],
    
  2. Profile Slow Queries: Use ClickHouse’s EXPLAIN with indexes = 1 to see whether the vector index is used:

    EXPLAIN indexes = 1 SELECT * FROM vectors ORDER BY L2Distance(embedding, [0.1, 0.2, ...]) LIMIT 5;
    
  3. Monitor ANN Index Performance: Check for related metrics. The column to filter on is metric, and which metrics exist depends on the ClickHouse version:

    SELECT * FROM system.asynchronous_metrics WHERE metric LIKE '%ann%';
    

Extension Points

  1. Custom Distance Functions: Override the default distance function in your store class:

    use Symfony\Component\AI\Store\ClickHouseStore;
    
    class CustomClickHouseStore extends ClickHouseStore {
        protected function getDistanceFunction(): string {
            return 'cosineDistance'; // Use cosine instead of L2
        }
    }
    
  2. Dynamic Indexing: Implement runtime index selection based on query patterns:

    public function find(Query $query) {
        if ($query->limit > 100) {
            // Use a coarser index for large batches
            $this->useIndex('coarse_ann_index');
        }
        return parent::find($query);
    }
    
  3. Batch Operations: Extend for bulk operations (not natively supported):

    public function batchAdd(array $items) {
        $sql = 'INSERT INTO vectors (id, embedding, metadata) VALUES ';
        $values = [];
        foreach ($items as $item) {
            $values[] = "('{$item['id
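
Building a VALUES clause by string interpolation requires careful escaping of every value. An alternative is ClickHouse's JSONEachRow input format, which takes one JSON object per line after an INSERT INTO vectors (id, embedding, metadata) FORMAT JSONEachRow statement. The sketch below only builds that request body in plain PHP. toJsonEachRow is a hypothetical helper, not part of the package, and how the body is sent depends on your ClickHouse client.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: encodes items as newline-delimited JSON rows.
 * Each item is expected to have 'id', 'embedding' and optional 'metadata'.
 * Metadata is encoded as a JSON string, matching a String column.
 */
function toJsonEachRow(array $items): string
{
    $lines = [];
    foreach ($items as $item) {
        $lines[] = json_encode([
            'id' => $item['id'],
            'embedding' => array_values($item['embedding']),
            // Cast to object so empty metadata becomes {} rather than []
            'metadata' => json_encode((object) ($item['metadata'] ?? []), JSON_THROW_ON_ERROR),
        ], JSON_THROW_ON_ERROR);
    }

    return implode("\n", $lines);
}
```

Because json_encode handles quoting, values containing quotes or backslashes need no manual escaping.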
    