jaybizzle/crawler-detect
Detect bots, crawlers, and spiders in PHP by matching User-Agent and HTTP_FROM headers. CrawlerDetect recognizes thousands of known user agents, is regularly updated, lets you check current or provided user agents, and can return the matched crawler name.
composer require jaybizzle/crawler-detect
use Jaybizzle\CrawlerDetect\CrawlerDetect;
$crawler = new CrawlerDetect();
if ($crawler->isCrawler()) {
// Handle crawler logic (e.g., serve lightweight content)
}
Create a middleware to block or redirect crawlers globally (now includes new signatures like ChatGPT-User):
php artisan make:middleware BlockCrawlers
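Middleware Logic (check the matched string against the bots you want to handle):
public function handle($request, Closure $next) {
    $crawler = new CrawlerDetect();
    // isCrawler() takes a user agent string (or null for the current request), not a list of bot names
    if ($crawler->isCrawler($request->userAgent())) {
        $match = (string) $crawler->getMatches(); // the part of the user agent that matched
        foreach (['ChatGPT-User', 'claude-web', 'Puppeteer'] as $bot) {
            if (stripos($match, $bot) !== false) {
                return redirect()->route('crawler-fallback');
            }
        }
    }
    return $next($request);
}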
Register in app/Http/Kernel.php:
protected $middleware = [
\App\Http\Middleware\BlockCrawlers::class,
];
Leverage Laravel’s Request object with new bot signatures:
use Illuminate\Http\Request;
public function index(Request $request) {
    // The constructor reads the User-Agent, HTTP_FROM, and related headers from the array it is given
    $crawler = new CrawlerDetect($request->server->all());
    // isCrawler() takes no list of names; inspect the matched string instead
    $match = $crawler->isCrawler() ? (string) $crawler->getMatches() : '';
    foreach (['ChatGPT-User', 'claude-web', 'iOS NetworkingExtension'] as $bot) {
        if (stripos($match, $bot) !== false) {
            return response()->view('crawler_fallback');
        }
    }
// Normal logic...
}
Centralize crawler detection with new signatures:
// app/Providers/CrawlerServiceProvider.php
public function register() {
$this->app->singleton(CrawlerDetect::class, function () {
        // new CrawlerDetect($headers) reads the User-Agent, HTTP_FROM, and related headers from the array.
        // The class has no setHttpFrom() or addCrawlers() method, so signatures missing
        // from the bundled list need a separate preg_match() check.
        return new CrawlerDetect(request()->server->all());
});
}
Use getMatches() to handle new bots like ChatGPT-User:
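@php
    // isCrawler('ChatGPT-User') would test that literal string as a user agent, so read the match instead
    $match = $crawler->isCrawler() ? (string) $crawler->getMatches() : '';
@endphp
@if(stripos($match, 'ChatGPT-User') !== false)
@include('schemas.chatgpt')
@elseif(stripos($match, 'claude-web') !== false)
@include('schemas.claude')
@elseif($match !== '')
@include('schemas.crawler_fallback')
@endif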
Combine with Laravel’s throttle middleware for new bots:
Route::middleware(['throttle:60,1', 'crawler'])->group(function () {
Route::get('/api/data', 'DataController@index');
});
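Custom Middleware (explicitly throttle new bots; requires use Illuminate\Support\Facades\RateLimiter;):
public function handle($request, Closure $next) {
    $crawler = new CrawlerDetect();
    $match = $crawler->isCrawler($request->userAgent()) ? (string) $crawler->getMatches() : '';
    if (stripos($match, 'ChatGPT-User') !== false || stripos($match, 'claude-web') !== false) {
        // A response has no throttle() method; apply the stricter limit with the RateLimiter facade
        $key = 'ai-bot:' . $request->ip();
        if (RateLimiter::tooManyAttempts($key, 10)) {
            abort(429);
        }
        RateLimiter::hit($key, 60); // 10 requests per 60 seconds
    }
    return $next($request);
}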
Log new bot signatures for monitoring:
if ($crawler->isCrawler()) {
\Log::channel('bot_audit')->info('Crawler detected', [
'bot' => $crawler->getMatches(),
'ip' => $request->ip(),
'user_agent' => $request->userAgent(),
// getMatches() returns a string (or null), so test it with a regex rather than in_array()
'is_new_bot' => (bool) preg_match('/ChatGPT-User|claude-web/i', (string) $crawler->getMatches()),
]);
}
False Positives for New Bots:
ChatGPT-User or claude-web may trigger false positives if misconfigured. Use getMatches() to verify:
if (stripos((string) $crawler->getMatches(), 'ChatGPT-User') !== false) {
// Handle with caution or whitelist
}
Amazon CloudFront was removed in v1.3.11 (PR #602). Update any custom logic referencing it.
Header Spoofing for New Bots:
Some clients (e.g., iOS NetworkingExtension) may spoof headers. Cross-validate with:
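// CrawlerDetect has no setHttpFrom() or setHttpVia(); pass the server variables to the
// constructor and it reads the headers it inspects, including HTTP_FROM.
// Any header the library does not inspect needs a separate check.
$crawler = new CrawlerDetect($request->server->all());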
Performance Overhead:
$userAgent = (string) $request->userAgent();
$cacheKey = 'crawler_' . md5($userAgent);
$isCrawler = Cache::remember($cacheKey, 60, function () use ($crawler, $userAgent) {
    // isCrawler() expects a user agent string, not a list of bot names
    return $crawler->isCrawler($userAgent);
});
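The idea behind caching the result per user agent can be sketched without Laravel. The helper below, memoizeDetector(), is a hypothetical name and not part of jaybizzle/crawler-detect; it wraps any detection callable in an in-memory memo keyed by the md5 of the user agent, and a simple stand-in check takes the place of the real detector. In an application the array would be replaced by the Cache facade.

```php
<?php
// Hypothetical sketch: wrap any detection callable in a per-user-agent memo.
function memoizeDetector(callable $detect): callable
{
    $cache = [];
    return function (?string $userAgent) use (&$cache, $detect): bool {
        $key = md5((string) $userAgent); // a missing header hashes as an empty string
        if (!array_key_exists($key, $cache)) {
            $cache[$key] = (bool) $detect((string) $userAgent);
        }
        return $cache[$key];
    };
}

$calls = 0;
$isBot = memoizeDetector(function (string $ua) use (&$calls): bool {
    $calls++; // counts how often the underlying check really runs
    return stripos($ua, 'bot') !== false; // stand-in for a real isCrawler() call
});

$first = $isBot('ExampleBot/1.0');
$second = $isBot('ExampleBot/1.0'); // answered from the memo, no second check
```

Because the key depends only on the user agent, every request that sends the same string shares one cached answer.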
dd($crawler->getMatches()); // Check for 'ChatGPT-User', 'claude-web', etc.
$crawler->isCrawler('ChatGPT-User/1.0'); // Test manually
composer update jaybizzle/crawler-detect
Custom Regex for New Bots:
Extend detection for new signatures (e.g., iOS NetworkingExtension):
// app/CrawlerDetect/CustomCrawlers.php
return [
'iOS NetworkingExtension' => '/iOS NetworkingExtension\/[0-9.]+/i',
'CustomChatBot' => '/CustomChatBot\/[0-9.]+/i',
];
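Apply in CrawlerServiceProvider or a middleware (CrawlerDetect has no addCrawlers() method, so match these patterns yourself):
// Keeps the name => pattern pairs whose regex matches the current user agent
$hits = array_filter(require app_path('CrawlerDetect/CustomCrawlers.php'), fn ($pattern) => preg_match($pattern, (string) request()->userAgent()) === 1);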
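As a rough sketch of how a name-to-regex map like the one above could be applied without relying on any library method, the helper below walks the patterns and returns the first name whose regex matches. The function name matchCustomCrawler() and the example user agent are hypothetical and not part of jaybizzle/crawler-detect.

```php
<?php
// Hypothetical helper, not part of jaybizzle/crawler-detect.
// Returns the name of the first custom signature whose regex matches, or null.
function matchCustomCrawler(?string $userAgent, array $patterns): ?string
{
    $userAgent = (string) $userAgent; // a missing header becomes an empty string
    foreach ($patterns as $name => $pattern) {
        if (preg_match($pattern, $userAgent) === 1) {
            return $name;
        }
    }
    return null;
}

// Same shape as the array returned by CustomCrawlers.php.
$patterns = [
    'iOS NetworkingExtension' => '/iOS NetworkingExtension\/[0-9.]+/i',
    'CustomChatBot' => '/CustomChatBot\/[0-9.]+/i',
];

$hit = matchCustomCrawler('CustomChatBot/2.1 (+https://example.com/bot)', $patterns);
```

Running the custom patterns next to the library's own check keeps the bundled list untouched, so a composer update cannot overwrite local signatures.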
Whitelist/Blacklist for New Bots:
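Define your own lists in an application config file such as config/crawler-detect.php (the base package does not read this file; your middleware does):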
'blocked_bots' => ['ChatGPT-User', 'claude-web', 'Scrapy'],
'whitelisted_bots' => ['Googlebot', 'Bingbot'],
Middleware Logic (updated for new bots):
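// getMatches() returns the matched string (or null), so compare it against each configured name
$match = $crawler->isCrawler() ? (string) $crawler->getMatches() : '';
$blocked = array_filter(config('crawler-detect.blocked_bots'), fn ($bot) => stripos($match, $bot) !== false);
if (count($blocked) > 0) {
    abort(403);
}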
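The block and allow lists above can be combined into one decision. The following is a minimal sketch, assuming the matched string has already been read from getMatches(); classifyBot() and its return labels are hypothetical names chosen for illustration, and the allow list is checked first so that a trusted crawler is never blocked by accident.

```php
<?php
// Hypothetical helper: classifies a matched crawler string against two name lists.
// $match is the string (or null) returned by getMatches() after isCrawler().
function classifyBot(?string $match, array $blocked, array $allowed): string
{
    if ($match === null || $match === '') {
        return 'none'; // no crawler was detected
    }
    foreach ($allowed as $name) {
        if (stripos($match, $name) !== false) {
            return 'allowed';
        }
    }
    foreach ($blocked as $name) {
        if (stripos($match, $name) !== false) {
            return 'blocked';
        }
    }
    return 'other'; // a crawler that is on neither list
}

$blocked = ['ChatGPT-User', 'claude-web', 'Scrapy'];
$allowed = ['Googlebot', 'Bingbot'];

$decision = classifyBot('ChatGPT-User', $blocked, $allowed);
```

A middleware could then call abort(403) only when the result is 'blocked' and let every other case through.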
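Laravel Integration: Use the Laravel-Crawler-Detect package (jaybizzle/laravel-crawler-detect) for a Crawler facade: Crawler::isCrawler() checks the current request, and Crawler::isCrawler($userAgent) checks a given string.
// isCrawler() does not accept a list of bot names; read the matched string once and test it
$match = $crawler->isCrawler() ? (string) $crawler->getMatches() : '';
if (preg_match('/ChatGPT-User|claude-web/i', $match)) {
    abort(403, 'AI crawlers not allowed');
}
if (stripos($match, 'iOS NetworkingExtension') !== false) {
    return response()->view('lightweight')->header('Content-Type', 'text/plain');
}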
ChatGPT-User/claude-web activity over time with