spatie/crawler
PHP web crawler that discovers links concurrently via Guzzle, with optional JavaScript rendering powered by Chrome/Puppeteer. Configure depth, internal-only rules, and callbacks for per-page handling, plus a fake mode to test crawl logic without real HTTP requests.
This package provides a powerful, easy-to-use class for crawling the links on a website. Under the hood, it uses Guzzle promises to crawl multiple URLs concurrently.
Because the crawler can execute JavaScript, it can also crawl JavaScript-rendered sites. This feature is powered by Chrome and Puppeteer.
Here's a quick example:
use Spatie\Crawler\Crawler;
use Spatie\Crawler\CrawlResponse;

Crawler::create('https://example.com')
    ->onCrawled(function (string $url, CrawlResponse $response) {
        echo "{$url}: {$response->status()}\n";
    })
    ->start();
Or collect all URLs on a site:
$urls = Crawler::create('https://example.com')
    ->internalOnly()
    ->depth(3)
    ->foundUrls();
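The two examples above can plausibly be combined. The following is a minimal sketch, not a confirmed recipe: it assumes that `internalOnly()`, `depth()`, and `onCrawled()` can be chained on the same crawler instance before `start()`, which the examples above suggest but do not show directly. The `$statuses` array is a hypothetical variable introduced only for illustration.

```php
<?php

use Spatie\Crawler\Crawler;
use Spatie\Crawler\CrawlResponse;

// Hypothetical accumulator: maps each crawled URL to its HTTP status code.
$statuses = [];

// Assumption: the builder methods shown separately above may be chained together.
Crawler::create('https://example.com')
    ->internalOnly()   // stay on the starting site
    ->depth(2)         // limit how far the crawler follows links
    ->onCrawled(function (string $url, CrawlResponse $response) use (&$statuses) {
        // Record the status instead of echoing it.
        $statuses[$url] = $response->status();
    })
    ->start();

// Print a simple report once the crawl has finished.
foreach ($statuses as $url => $status) {
    echo "{$url}: {$status}\n";
}
```

Collecting results in an array captured by reference keeps the callback small and leaves reporting to run once the crawl is done.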