spatie/laravel-site-search
Crawl and index your Laravel site for fast full-text search, like a private Google. Crawling and indexing are highly customizable and run with concurrent requests. Uses SQLite FTS5 by default (no external services), or Meilisearch for advanced features.
Installation:
composer require spatie/laravel-site-search
php artisan vendor:publish --provider="Spatie\SiteSearch\SiteSearchServiceProvider"
This publishes the config file (config/site-search.php) and a migration. Apply the migration with php artisan migrate.
First Crawl:
Define a crawler in config/site-search.php:
'crawlers' => [
'default' => [
'url' => 'https://your-site.com',
'routes' => ['*'], // Crawl all routes
'depth' => 3, // Max depth of links to follow
],
],
Run the crawler:
php artisan site-search:crawl default
Searching:
Use the SiteSearch facade to query:
use Spatie\SiteSearch\Facades\SiteSearch;
$results = SiteSearch::search('query');
Add a search form to a view:
<form method="GET" action="/search">
<input type="text" name="q">
<button type="submit">Search</button>
</form>
Create a route and controller:
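Route::get('/search', [SearchController::class, 'index']);
// In SearchController (imports: Illuminate\Http\Request and the SiteSearch facade)
public function index(Request $request) {
// "q" matches the name of the form's text input
$results = SiteSearch::search($request->input('q', ''));
return view('search.results', compact('results'));
}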
Custom Crawlers: Define multiple crawlers for different parts of your site (e.g., admin vs. public):
'crawlers' => [
'public' => [
'url' => 'https://your-site.com',
'routes' => ['/blog/*', '/products/*'],
'depth' => 2,
'exclude_routes' => ['/admin/*'],
],
'admin' => [
'url' => 'https://your-site.com/admin',
'routes' => ['/admin/*'],
'authenticated' => true, // Use middleware
],
],
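The include and exclude patterns above can be pictured as shell-style wildcard matching on the URL path. The following is a minimal sketch in plain PHP, assuming exclusions take priority over inclusions; pathInScope() is a hypothetical helper written for illustration and is not part of the package.

```php
<?php
// Hypothetical helper, not part of spatie/laravel-site-search.
// Decides whether a URL path falls inside a crawler's scope.
function pathInScope(string $path, array $routes, array $excludeRoutes = []): bool
{
    // Assumption: an exclude pattern always wins over an include pattern.
    foreach ($excludeRoutes as $pattern) {
        if (fnmatch($pattern, $path)) {
            return false;
        }
    }

    // The path is in scope if at least one include pattern matches.
    foreach ($routes as $pattern) {
        if (fnmatch($pattern, $path)) {
            return true;
        }
    }

    return false;
}

$routes = ['/blog/*', '/products/*'];
$exclude = ['/admin/*'];

var_dump(pathInScope('/blog/hello-world', $routes, $exclude));
var_dump(pathInScope('/admin/users', $routes, $exclude));
```

Note that in this sketch a pattern such as /blog/* does not match /blog itself, so an index page would need its own pattern.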
Dynamic Routes: Use route names or patterns to control crawling scope:
'routes' => ['blog.post', 'products.show'], // Named routes
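Partial Indexing: Schedule crawls to run periodically (e.g., nightly) with Laravel's task scheduler.
Add to the schedule() method in app/Console/Kernel.php:
$schedule->command('site-search:crawl default')->daily();
The scheduler only fires if the server's cron runs this command every minute:
php artisan schedule:run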
Incremental Crawling:
Use the only_update_existing_pages option to avoid re-crawling unchanged pages:
'crawlers' => [
'default' => [
'only_update_existing_pages' => true,
],
],
Result Formatting:
Extend the SearchResult model to add custom fields:
namespace App\Models;
use Spatie\SiteSearch\Models\SearchResult;
class CustomSearchResult extends SearchResult
{
protected $appends = ['custom_field'];
public function getCustomFieldAttribute()
{
return $this->attributes['custom_field'] ?? null;
}
}
Update the config to use your model:
'search_result_model' => App\Models\CustomSearchResult::class,
Highlighting Snippets:
Use the highlight method to get snippets of matched content:
$results = SiteSearch::search('query')->highlight();
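To make the idea of a highlighted snippet concrete, here is a minimal sketch in plain PHP. It is not the package's implementation: highlightSnippet() and its $radius parameter are hypothetical, and the byte-based string functions assume mostly ASCII content.

```php
<?php
// Hypothetical helper for illustration; the package's own highlighting may differ.
// Returns a short excerpt around the first match, with every match wrapped in <em>.
function highlightSnippet(string $text, string $term, int $radius = 30): ?string
{
    if ($term === '') {
        return null; // nothing to search for
    }

    $pos = stripos($text, $term); // case-insensitive search
    if ($pos === false) {
        return null; // term does not occur in the text
    }

    // Cut out the match plus up to $radius bytes of context on each side.
    $start = max(0, $pos - $radius);
    $length = ($pos - $start) + strlen($term) + $radius;
    $snippet = substr($text, $start, $length);

    // Split around matches first, then escape each piece, so the <em> tags
    // are the only markup in the result.
    $pattern = '/(' . preg_quote($term, '/') . ')/i';
    $parts = preg_split($pattern, $snippet, -1, PREG_SPLIT_DELIM_CAPTURE);

    $out = '';
    foreach ($parts as $i => $part) {
        $safe = htmlspecialchars($part, ENT_QUOTES);
        // Odd indexes hold the captured matches.
        $out .= ($i % 2 === 1) ? "<em>{$safe}</em>" : $safe;
    }

    return $out;
}

echo highlightSnippet('Laravel makes full-text search easy to add.', 'search') . PHP_EOL;
```

Escaping each piece separately keeps user-supplied page content from injecting markup into the results page.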
Meilisearch Driver: To use Meilisearch instead of SQLite, install the driver:
composer require spatie/laravel-meilisearch-driver
Update config:
'driver' => 'meilisearch',
'meilisearch' => [
'host' => env('MEILISEARCH_HOST', 'http://localhost:7700'),
'index' => 'site_search',
],
Re-run crawls to migrate data.
Log Crawling Issues: Enable debug mode in config:
'debug' => env('APP_DEBUG', false),
Check logs for failed requests or blocked URLs.
Rate Limiting: If crawling fails due to rate limits, adjust concurrency:
'crawler_options' => [
'concurrency' => 5, // Default is 10
],
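The effect of a lower concurrency value can be approximated by thinking of the crawl as batches of at most that many requests in flight. This plain-PHP sketch only illustrates that bound; batchUrls() is a hypothetical helper, and the package's real crawler may schedule requests differently.

```php
<?php
// Illustration only, not the package's implementation.
// Splits a URL list into batches no larger than the concurrency limit.
function batchUrls(array $urls, int $concurrency): array
{
    if ($concurrency < 1) {
        throw new InvalidArgumentException('Concurrency must be at least 1.');
    }

    return array_chunk($urls, $concurrency);
}

$urls = ['/a', '/b', '/c', '/d', '/e', '/f', '/g'];

foreach (batchUrls($urls, 5) as $i => $batch) {
    echo 'Batch ' . ($i + 1) . ': ' . implode(', ', $batch) . PHP_EOL;
}
```

With seven URLs and a limit of 5, the target server never sees more than five requests from one batch, at the cost of a longer total crawl.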
Exclude Static Assets: Filter out non-content URLs in the crawler config:
'exclude_routes' => ['/assets/*', '/images/*', '/js/*', '/css/*'],
Optimize SQLite: For large sites, ensure your SQLite database is optimized:
php artisan site-search:optimize
Middleware for Authenticated Crawling:
Use the middleware option to apply middleware during crawling:
'crawlers' => [
'admin' => [
'middleware' => ['auth.admin'],
],
],
Custom HTTP Client: Override the default HTTP client for proxies or custom headers:
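// macro() returns nothing, so its result cannot be used as the 'http_client' value.
// Register the macro in a service provider's boot() method instead:
\Illuminate\Support\Facades\Http::macro('custom', function () {
return $this->withOptions(['verify' => false]); // Example only: disables TLS certificate verification
});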
Custom Crawler Logic:
Extend the Spatie\SiteSearch\Crawlers\Crawler class to add pre/post-processing:
namespace App\Crawlers;
use Spatie\SiteSearch\Crawlers\Crawler;
class CustomCrawler extends Crawler
{
protected function shouldCrawlUrl(string $url): bool
{
// Custom logic here
return parent::shouldCrawlUrl($url);
}
}
Register in config:
'crawler_class' => App\Crawlers\CustomCrawler::class,
Search Result Transformers:
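Use the transformer option to modify results before returning. Note that php artisan config:cache fails when a config file contains a closure, so this form only works with uncached configuration: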
'transformer' => function ($results) {
return $results->map(function ($result) {
$result->score = $result->score * 1.5; // Boost scores
return $result;
});
},
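Multiplying every score by the same factor leaves the ranking unchanged, so a boost is usually only useful when applied selectively. The sketch below shows that idea in plain PHP, assuming each result is an array with url and score keys; that shape and the boostByPrefix() helper are assumptions made for illustration, not the package's result format.

```php
<?php
// Illustration only: the result shape (url, score) is an assumption.
// Boosts results whose URL starts with $prefix, then re-sorts by score.
function boostByPrefix(array $results, string $prefix, float $factor = 1.5): array
{
    $boosted = array_map(function (array $result) use ($prefix, $factor) {
        if (strpos($result['url'], $prefix) === 0) {
            $result['score'] *= $factor;
        }
        return $result;
    }, $results);

    // Highest score first.
    usort($boosted, fn (array $a, array $b) => $b['score'] <=> $a['score']);

    return $boosted;
}

$results = [
    ['url' => '/blog/intro', 'score' => 1.0],
    ['url' => '/docs/install', 'score' => 0.8],
];

foreach (boostByPrefix($results, '/docs/') as $result) {
    echo $result['url'] . ' => ' . $result['score'] . PHP_EOL;
}
```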
Circular References:
Avoid infinite loops by setting a reasonable depth and monitoring logs.
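Two guards keep a crawl from looping: a set of already-visited URLs and a maximum depth. The following plain-PHP sketch walks a made-up in-memory link graph breadth-first to show both; crawlOrder() and the $links data are hypothetical, and no HTTP requests are made.

```php
<?php
// Illustration only: a breadth-first walk over an in-memory link graph.
// $links maps each URL to the URLs it links to.
function crawlOrder(array $links, string $start, int $maxDepth): array
{
    $visited = [$start => true]; // guards against circular references
    $queue = [[$start, 0]];      // pairs of [url, depth]
    $order = [];

    while ($queue) {
        [$url, $depth] = array_shift($queue);
        $order[] = $url;

        // Do not follow links from pages that sit at the depth limit.
        if ($depth >= $maxDepth) {
            continue;
        }

        foreach ($links[$url] ?? [] as $next) {
            if (!isset($visited[$next])) {
                $visited[$next] = true;
                $queue[] = [$next, $depth + 1];
            }
        }
    }

    return $order;
}

// '/' and '/a' link to each other, which would loop without the visited set.
$links = [
    '/'  => ['/a', '/b'],
    '/a' => ['/', '/c'],
    '/b' => ['/a'],
    '/c' => ['/d'],
    '/d' => [],
];

echo implode(', ', crawlOrder($links, '/', 2)) . PHP_EOL;
```

In this example /d is never reached because it sits three links away from the start page, beyond the depth limit of 2.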
Dynamic Content: Ensure JavaScript-rendered content is crawled by using a headless browser (e.g., Puppeteer) via a custom crawler.
Case Sensitivity: SQLite FTS5 (with its default unicode61 tokenizer) and Meilisearch both match case-insensitively by default, so "Laravel" and "laravel" return the same results.