
Crawler Detect Laravel Package

jaybizzle/crawler-detect

Detect bots, crawlers, and spiders in PHP by matching User-Agent and HTTP_FROM headers. CrawlerDetect recognizes thousands of known user agents, is regularly updated, lets you check current or provided user agents, and can return the matched crawler name.


Getting Started

Minimal Setup

  1. Install via Composer:
    composer require jaybizzle/crawler-detect
    
  2. Basic Usage in Laravel:
    use Jaybizzle\CrawlerDetect\CrawlerDetect;
    
    $crawler = new CrawlerDetect();
    if ($crawler->isCrawler()) {
        // Handle crawler logic (e.g., serve lightweight content)
    }
    

First Use Case: Middleware Integration

Create a middleware that redirects selected crawlers globally (for example ChatGPT-User):

php artisan make:middleware BlockCrawlers

Middleware Logic (isCrawler() takes an optional User-Agent string, so specific bots are identified through getMatches()):

public function handle($request, Closure $next) {
    $crawler = new CrawlerDetect();
    // Lowercase names to compare against the matched substring from getMatches().
    $targets = ['chatgpt-user', 'claude-web', 'puppeteer'];
    if ($crawler->isCrawler($request->userAgent())
        && in_array(strtolower((string) $crawler->getMatches()), $targets, true)) {
        return redirect()->route('crawler-fallback');
    }
    return $next($request);
}

Register in app/Http/Kernel.php (Laravel 10 and earlier; Laravel 11+ registers middleware in bootstrap/app.php):

protected $middleware = [
    \App\Http\Middleware\BlockCrawlers::class,
];
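
On Laravel 11 and later, new applications have no app/Http/Kernel.php, and global middleware is registered in bootstrap/app.php instead. A minimal sketch, assuming the BlockCrawlers class created above and leaving out the withRouting() and withExceptions() calls that a real file would also contain:

```php
<?php
// bootstrap/app.php (Laravel 11+); routing and exception setup omitted for brevity.
use Illuminate\Foundation\Application;
use Illuminate\Foundation\Configuration\Middleware;

return Application::configure(basePath: dirname(__DIR__))
    ->withMiddleware(function (Middleware $middleware) {
        // append() adds the class to the end of the global middleware stack.
        $middleware->append(\App\Http\Middleware\BlockCrawlers::class);
    })
    ->create();
```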

Implementation Patterns

1. Request-Based Detection

Build the detector from Laravel’s Request object instead of PHP globals:

use Illuminate\Http\Request;
use Jaybizzle\CrawlerDetect\CrawlerDetect;

public function index(Request $request) {
    // The constructor reads HTTP_USER_AGENT, HTTP_FROM, and related headers from the server array.
    $crawler = new CrawlerDetect($request->server->all());

    // Explicitly check for selected bots by matched name
    $targets = ['chatgpt-user', 'claude-web', 'ios networkingextension'];
    if ($crawler->isCrawler() && in_array(strtolower((string) $crawler->getMatches()), $targets, true)) {
        return response()->view('crawler_fallback');
    }
    // Normal logic...
}

2. Service Provider Integration

Centralize crawler detection in the container:

// app/Providers/CrawlerServiceProvider.php
public function register() {
    $this->app->singleton(CrawlerDetect::class, function () {
        // The constructor picks up HTTP_USER_AGENT, HTTP_FROM, and related headers.
        // CrawlerDetect has no public method for adding signatures at runtime,
        // so custom patterns need to be checked separately.
        return new CrawlerDetect(request()->server->all());
    });
}

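3. Dynamic Content Delivery

Use getMatches() to branch on the detected bot. Passing a name to isCrawler() would test that name as if it were the User-Agent, so resolve the match once first:

@php
    $bot = $crawler->isCrawler() ? strtolower((string) $crawler->getMatches()) : null;
@endphp
@if($bot === 'chatgpt-user')
    @include('schemas.chatgpt')
@elseif($bot === 'claude-web')
    @include('schemas.claude')
@elseif($bot !== null)
    @include('schemas.crawler_fallback')
@endif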
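
Because getMatches() returns a single matched string (or null when nothing matched), the same branching can be expressed as a lookup table outside the template. The following is a minimal sketch in plain PHP; the schemaViewFor() helper and its map are hypothetical, and the view names are the ones used in the Blade snippet.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: pick a schema partial for a matched crawler name.
 * $match is whatever CrawlerDetect::getMatches() returned (string or null).
 */
function schemaViewFor(?string $match): ?string
{
    if ($match === null || $match === '') {
        return null; // not a crawler: render no schema partial
    }

    // Keys are lowercase so the lookup is case-insensitive.
    $views = [
        'chatgpt-user' => 'schemas.chatgpt',
        'claude-web'   => 'schemas.claude',
    ];

    // Any other crawler falls back to the generic partial.
    return $views[strtolower($match)] ?? 'schemas.crawler_fallback';
}
```

In a controller this could be called as schemaViewFor($crawler->isCrawler() ? $crawler->getMatches() : null) and the result passed to the view.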

4. API Rate Limiting

Combine with Laravel’s throttle middleware (the crawler alias is assumed to point at the custom middleware shown after the route group):

Route::middleware(['throttle:60,1', 'crawler'])->group(function () {
    Route::get('/api/data', 'DataController@index');
});

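Custom Middleware (applies a stricter limit to selected bots via the RateLimiter facade, since a response object has no throttle() method):

// use Illuminate\Support\Facades\RateLimiter;
public function handle($request, Closure $next) {
    $crawler = new CrawlerDetect();
    $targets = ['chatgpt-user', 'claude-web'];
    if ($crawler->isCrawler($request->userAgent())
        && in_array(strtolower((string) $crawler->getMatches()), $targets, true)) {
        // Stricter limit: 10 requests per 60 seconds per IP
        $key = 'crawler:' . $request->ip();
        if (RateLimiter::tooManyAttempts($key, 10)) {
            abort(429);
        }
        RateLimiter::hit($key, 60);
    }
    return $next($request);
}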
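
The idea of giving crawlers a tighter budget than regular clients can be illustrated without the framework. This is a minimal sketch of a fixed-window counter in plain PHP; FixedWindowLimiter and limitFor() are hypothetical names, the state lives only in memory, and a real application would rely on Laravel's rate limiter backed by a shared cache.

```php
<?php
declare(strict_types=1);

/** Hypothetical in-memory fixed-window counter, for illustration only. */
final class FixedWindowLimiter
{
    /** @var array<string, array{count: int, resetAt: int}> */
    private array $windows = [];

    /** Records one attempt and returns true while the caller is within the limit. */
    public function attempt(string $key, int $maxAttempts, int $decaySeconds, int $now): bool
    {
        $window = $this->windows[$key] ?? null;

        // Start a fresh window on first use or once the previous one has expired.
        if ($window === null || $now >= $window['resetAt']) {
            $window = ['count' => 0, 'resetAt' => $now + $decaySeconds];
        }

        $window['count']++;
        $this->windows[$key] = $window;

        return $window['count'] <= $maxAttempts;
    }
}

/** Crawlers get 10 requests per window, everyone else 60 (the figures used in this section). */
function limitFor(bool $isCrawler): int
{
    return $isCrawler ? 10 : 60;
}
```

The timestamp is passed in explicitly so the behaviour is easy to test; in a request handler it would be time().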

5. Logging & Analytics

Log matched bot names for monitoring:

if ($crawler->isCrawler()) {
    $bot = $crawler->getMatches(); // Matched substring of the User-Agent, or null
    \Log::channel('bot_audit')->info('Crawler detected', [
        'bot' => $bot,
        'ip' => $request->ip(),
        'user_agent' => $request->userAgent(),
        'is_new_bot' => in_array(strtolower((string) $bot), ['chatgpt-user', 'claude-web'], true),
    ]);
}

Gotchas and Tips

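Pitfalls

  1. False Positives for New Bots:

    • isCrawler() only reports that some signature matched; it does not say which one, so logic aimed at ChatGPT-User or claude-web can fire for unrelated crawlers.
    • Fix: Check the matched name with getMatches(), which returns a string or null:
      if ($crawler->isCrawler() && strcasecmp((string) $crawler->getMatches(), 'ChatGPT-User') === 0) {
          // Handle with caution or whitelist
      }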
      
    • Note: Amazon CloudFront was removed in v1.3.11 (PR #602). Update any custom logic referencing it.
  2. Header Spoofing for New Bots:

    • Any client can send an arbitrary User-Agent, so header matching cannot catch a bot that impersonates a browser. To check every header the library inspects (HTTP_USER_AGENT, HTTP_FROM, and others), pass the full server array:
      $crawler = new CrawlerDetect($request->server->all());
      $isCrawler = $crawler->isCrawler();
      
  3. Performance Overhead:

    • The 22 new signatures in v1.3.11 add minimal overhead (~1–3ms per request). Cache results for high-traffic APIs:
      $userAgent = (string) $request->userAgent(); // Avoid passing null to md5()
      $cacheKey = 'crawler_' . md5($userAgent);
      $isCrawler = Cache::remember($cacheKey, 60, function () use ($crawler, $userAgent) {
          return $crawler->isCrawler($userAgent);
      });
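
The caching idea in item 3 amounts to memoizing the verdict per User-Agent string. A minimal sketch in plain PHP follows; CrawlerVerdictCache is a hypothetical class, the array stands in for Laravel's cache store, and the injected callable stands in for a call such as fn ($ua) => $crawler->isCrawler($ua).

```php
<?php
declare(strict_types=1);

/** Hypothetical memoizer: caches a detector's verdict per User-Agent string. */
final class CrawlerVerdictCache
{
    /** @var array<string, bool> */
    private array $verdicts = [];

    /** @var callable(string): bool */
    private $detector;

    public function __construct(callable $detector)
    {
        $this->detector = $detector;
    }

    public function isCrawler(?string $userAgent): bool
    {
        $userAgent = (string) $userAgent;     // treat a missing header as ''
        $key = 'crawler_' . md5($userAgent);  // same key scheme as in item 3

        // Only run the detector the first time a given User-Agent is seen.
        if (!array_key_exists($key, $this->verdicts)) {
            $this->verdicts[$key] = ($this->detector)($userAgent);
        }

        return $this->verdicts[$key];
    }
}
```

Keying on the User-Agent alone is only safe when the verdict depends on nothing else; if other headers such as HTTP_FROM feed into detection, they would need to be part of the key as well.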
      

Debugging Tips

  1. Inspect New Matches (getMatches() is only populated after isCrawler() has run):
    $crawler->isCrawler();
    dd($crawler->getMatches()); // Matched string such as 'ChatGPT-User', or null
    
  2. Test Custom UAs for New Bots:
    $crawler->isCrawler('ChatGPT-User/1.0'); // Test manually
    
  3. Update Regularly: The package now includes 22 new high-confidence signatures. Ensure your Laravel app pulls the latest:
    composer update jaybizzle/crawler-detect
    

Extension Points

  1. Custom Regex for New Bots: CrawlerDetect has no public method for registering extra signatures, so keep custom patterns (e.g., iOS NetworkingExtension) in your own file as bare regex fragments without delimiters:

    // app/CrawlerDetect/CustomCrawlers.php
    return [
        'iOS NetworkingExtension\/[0-9.]+',
        'CustomChatBot\/[0-9.]+',
    ];
    

    Check them alongside the library, for example in the middleware:

    $custom = require app_path('CrawlerDetect/CustomCrawlers.php');
    $isCustomBot = (bool) preg_match('/(' . implode('|', $custom) . ')/i', (string) $request->userAgent());
    
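  2. Whitelist/Blacklist for New Bots: Keep your own lists in an application config file such as config/crawler-detect.php (the package does not ship one):

    'blocked_bots' => ['ChatGPT-User', 'claude-web', 'Scrapy'],
    'whitelisted_bots' => ['Googlebot', 'Bingbot'],
    

    Middleware Logic (getMatches() returns a single string, so compare it case-insensitively):

    $match = $crawler->isCrawler() ? (string) $crawler->getMatches() : '';
    $blocked = array_map('strtolower', config('crawler-detect.blocked_bots', []));
    if ($match !== '' && in_array(strtolower($match), $blocked, true)) {
        abort(403);
    }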
    
  3. Laravel Integration: The jaybizzle/laravel-crawler-detect wrapper package provides a facade, commonly aliased as Crawler:

    • Crawler::isCrawler() checks the current request.
    • Crawler::isCrawler($userAgent) checks a given User-Agent string.
    • Crawler::getMatches() returns the matched name after a positive check.
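
Custom signatures like the ones in item 1 can be exercised independently of the library before they go into production. A minimal sketch in plain PHP, assuming patterns are stored as bare regex fragments (no delimiters or flags); compileBotPattern() and matchCustomBot() are hypothetical helper names, not part of CrawlerDetect.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: join regex fragments into one case-insensitive pattern.
 * Fragments must not carry their own delimiters or flags.
 */
function compileBotPattern(array $fragments): string
{
    return '/(' . implode('|', $fragments) . ')/i';
}

/** Returns the matched substring, or null when no custom pattern matches. */
function matchCustomBot(string $userAgent, array $fragments): ?string
{
    if ($fragments === []) {
        return null; // an empty alternation would match every string
    }

    return preg_match(compileBotPattern($fragments), $userAgent, $m) === 1 ? $m[0] : null;
}

// Example fragments; the forward slash is escaped because "/" is the delimiter.
$fragments = [
    'iOS NetworkingExtension\/[0-9.]+',
    'CustomChatBot\/[0-9.]+',
];
```

Joining everything into a single alternation keeps the check to one preg_match() call per request, at the cost of not knowing which fragment matched beyond the returned substring.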

Pro Tips

  • Block AI Crawlers: Explicitly target AI-driven bots by matched name:
    $aiBots = ['chatgpt-user', 'claude-web'];
    if ($crawler->isCrawler() && in_array(strtolower((string) $crawler->getMatches()), $aiBots, true)) {
        abort(403, 'AI crawlers not allowed');
    }
    
  • Serve Lightweight Content to New Bots:
    if ($crawler->isCrawler() && strcasecmp((string) $crawler->getMatches(), 'iOS NetworkingExtension') === 0) {
        return response()->view('lightweight')->header('Content-Type', 'text/plain');
    }
    
  • Monitor New Bot Trends: Use Laravel Telescope to track ChatGPT-User/claude-web activity over time with