jaybizzle/crawler-detect
Detect bots, crawlers, and spiders in PHP by matching User-Agent and HTTP_FROM headers. CrawlerDetect recognizes thousands of known user agents, is regularly updated, lets you check current or provided user agents, and can return the matched crawler name.
- ChatGPT-User, claude-web (critical for detecting LLMs used for data extraction).
- iOS NetworkingExtension (relevant for iOS-based scraping tools).
- ^Amazon CloudFront (reduces misclassification of AWS infrastructure).
- ChatGPT-User or claude-web to block LLMs scraping proprietary data.
- iOS NetworkingExtension in APIs serving mobile traffic.
- CrawlerDetect::isCrawler()) works as before.
- getMatches() will now include the 24 new signatures by default.

| Risk Area | Updated Mitigation Strategy |
|---|---|
| False Positives | New: Explicitly whitelist Amazon CloudFront in config if needed (removed from defaults). |
| AI/Scraping Arms Race | New: Monitor for ChatGPT-User/claude-web in high-value endpoints (e.g., /api/v1/data). |
| Signature Staleness | Improved: 24 new signatures in a single release suggest active maintenance; fork if niche bots are missing. |
| Performance | Unchanged: <1ms latency; new signatures are regex-based (no parsing overhead). |
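
The Performance row describes the signatures as regex-based. As a rough illustration of that style of matching, the sketch below joins a few signature fragments into one case-insensitive pattern and reports which fragment matched. The function name `matchSignature` and the three-item list are hypothetical, and this is not presented as the library's actual implementation.

```php
<?php

/**
 * Hypothetical helper: returns the text matched by the first signature
 * found in the user agent, or null when nothing matches.
 *
 * @param string[] $signatures Regex fragments (illustrative, not the real list).
 */
function matchSignature(string $userAgent, array $signatures): ?string
{
    if ($signatures === []) {
        return null;
    }

    // Join the fragments into one case-insensitive alternation so the
    // user agent is scanned with a single preg_match() call.
    $pattern = '/(' . implode('|', $signatures) . ')/i';

    return preg_match($pattern, $userAgent, $matches) === 1 ? $matches[0] : null;
}

// Illustrative signatures only. Fragments are used as raw regex, so a
// fragment containing "/" would need escaping before being joined.
$signatures = ['ChatGPT-User', 'claude-web', 'NetworkingExtension'];

var_dump(matchSignature('Mozilla/5.0 AppleWebKit/537.36 (compatible; ChatGPT-User/1.0)', $signatures));
var_dump(matchSignature('Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/124.0', $signatures));
```

Because everything is checked in one pass, adding more fragments lengthens the pattern rather than adding per-signature parsing steps.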
- ChatGPT-User) a known risk for your product?
- NetworkingExtension?
- Amazon CloudFront incorrectly blocked in prior versions?
- config/crawler-detect.php whitelist if using AWS CDN.
- PerplexityBot, MidjourneyScraper) not covered?
- getMatches() to log unknown UAs for analysis.
- if ($request->isCrawler(['ChatGPT-User']))).
- BotDetected events for AI scrapers to trigger custom responses (e.g., honeypot challenges).

@inject('crawler', 'Jaybizzle\CrawlerDetect\CrawlerDetect')
@if($crawler->isCrawler('ChatGPT-User'))
    {{-- Serve degraded content or CAPTCHA --}}
@endif
composer update jaybizzle/crawler-detect
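$crawler->isCrawler('ChatGPT-User'); // The argument is treated as the user-agent string, so this should return true once the ChatGPT-User signature is present; ordinary browser user agents should return false.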
DetectCrawlers middleware to handle AI scrapers:
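// Assumes a custom isCrawler() request macro that accepts signature names;
// Laravel's Request class does not provide one out of the box.
if ($request->isCrawler(['ChatGPT-User', 'claude-web'])) {
    event(new BotDetected($request, 'AI_Scraper'));
    abort(429); // Rate limit
}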
if ($crawler->isCrawler('ChatGPT-User')) {
    \Log::channel('bot_audit')->info('AI Scraper detected', ['ua' => $request->userAgent()]);
}
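
The logging snippet above records the user agent but not which signature matched. A minimal sketch of building a richer log context follows, assuming a detector object that exposes `isCrawler(string)` and `getMatches()` the way `Jaybizzle\CrawlerDetect\CrawlerDetect` does. The helper name `crawlerAuditContext` and the stand-in detector are hypothetical and exist only so the example runs without Composer.

```php
<?php

/**
 * Hypothetical helper: builds an audit-log context for a user agent, or
 * returns null when the detector does not flag it.
 */
function crawlerAuditContext(object $detector, string $userAgent): ?array
{
    // Pass the incoming user agent itself; a hard-coded bot name would be
    // tested as if it were the request header.
    if (!$detector->isCrawler($userAgent)) {
        return null;
    }

    return [
        'ua' => $userAgent,
        // getMatches() reports what matched during the last isCrawler() call.
        'matched' => $detector->getMatches(),
    ];
}

// Stand-in detector (an assumption for this sketch). In an application,
// pass `new \Jaybizzle\CrawlerDetect\CrawlerDetect()` instead.
$stubDetector = new class {
    private ?string $match = null;

    public function isCrawler(string $userAgent): bool
    {
        $found = preg_match('/ChatGPT-User|claude-web/i', $userAgent, $m) === 1;
        $this->match = $found ? $m[0] : null;

        return $found;
    }

    public function getMatches(): ?string
    {
        return $this->match;
    }
};

print_r(crawlerAuditContext($stubDetector, 'Mozilla/5.0 (compatible; ChatGPT-User/1.0)'));
```

In a Laravel application the returned array could be passed as the context argument to `\Log::channel('bot_audit')->info(...)`, skipping the call when the helper returns null.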
- config/crawler-detect.php.

| Component | Updated Compatibility Notes |
|---|---|
| Laravel Versions | Unchanged (5.5–10.x). |
| PHP Versions | Unchanged (7.4–8.4). |
| New Signatures | No breaking changes; existing code works with expanded bot detection. |
| AWS CloudFront | Removed from defaults; add back to whitelist in config if needed. |
composer update jaybizzle/crawler-detect
- isCrawler() for ChatGPT-User, claude-web, and iOS NetworkingExtension in staging.

// app/Http/Middleware/DetectCrawlers.php
public function handle($request, Closure $next) {
    if ($request->isCrawler(['ChatGPT-User', 'Scrapy'])) {
        $this->logBotInteraction($request);
        return response('Forbidden', 403);
    }
    return $next($request);
}
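
The middleware above passes an array of names to `isCrawler()`. As far as I know, the library's `isCrawler()` takes a single user-agent string, so blocking only selected bots would need a comparison against `getMatches()`. The framework-free sketch below shows one way to make that decision. `statusForUserAgent` and the stand-in detector are hypothetical names introduced for illustration.

```php
<?php

/**
 * Hypothetical helper: picks an HTTP status based on which signature the
 * detector matched. Only crawlers named in $blocked are rejected.
 *
 * @param string[] $blocked Signature names to reject (illustrative).
 */
function statusForUserAgent(object $detector, ?string $userAgent, array $blocked): int
{
    if ($userAgent === null || !$detector->isCrawler($userAgent)) {
        return 200;
    }

    // Compare case-insensitively, since the matched text keeps the casing
    // of the incoming header. Signatures that are real regexes may match
    // text that differs from their name, so this check is approximate.
    $matched = strtolower((string) $detector->getMatches());
    $blocked = array_map('strtolower', $blocked);

    return in_array($matched, $blocked, true) ? 403 : 200;
}

// Stand-in detector (an assumption); swap in CrawlerDetect in real code.
$fakeDetector = new class {
    private ?string $match = null;

    public function isCrawler(string $userAgent): bool
    {
        $found = preg_match('/ChatGPT-User|Scrapy|Googlebot/i', $userAgent, $m) === 1;
        $this->match = $found ? $m[0] : null;

        return $found;
    }

    public function getMatches(): ?string
    {
        return $this->match;
    }
};

echo statusForUserAgent($fakeDetector, 'Scrapy/2.11 (+https://scrapy.org)', ['ChatGPT-User', 'Scrapy']), PHP_EOL;
```

Inside a Laravel middleware, the returned status could drive `response('Forbidden', 403)` versus `$next($request)`, leaving crawlers outside the blocklist untouched.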
- laravel-captcha).
- getMatches() for unknown bots and contribute signatures.
- getMatches() to audit new signatures and adjust whitelists.
- User-Agent headers for custom regex tuning (e.g., ChatGPT-User variants).
- User-Agent policies).

| Scenario | Updated Mitigation |
|---|---|
| False Positives | New: Verify Amazon CloudFront is not blocked; whitelist if needed. |
| AI Scraper Evasion | New: Monitor for ChatGPT-User variants; extend middleware with IP reputation checks. |
| Signature Bloat | New: Use getMatches() to filter irrelevant signatures (e.g., mobile agents). |
| Middleware Conflicts | Unchanged: Test pipeline order in staging with php artisan route:list. |
- ChatGPT-User).
- laravel-rate-limiting).
- getMatches() output