Crawler Laravel Package

spatie/crawler

PHP web crawler that discovers links concurrently via Guzzle, with optional JavaScript rendering powered by Chrome/Puppeteer. Configure depth, internal-only rules, and callbacks for per-page handling, plus a fake mode to test crawl logic without real HTTP requests.

Your first crawl

The simplest way to start crawling is to use closure callbacks:

use Spatie\Crawler\Crawler;
use Spatie\Crawler\CrawlResponse;

Crawler::create('https://example.com')
    ->onCrawled(function (string $url, CrawlResponse $response) {
        echo "{$url}: {$response->status()}\n";
    })
    ->start();

The following callbacks are available:

use GuzzleHttp\Exception\RequestException;
use Spatie\Crawler\Crawler;
use Spatie\Crawler\CrawlProgress;
use Spatie\Crawler\CrawlResponse;
use Spatie\Crawler\Enums\FinishReason;
use Spatie\Crawler\Enums\ResourceType;

Crawler::create('https://example.com')
    ->onWillCrawl(function (string $url, ?string $linkText, ?ResourceType $resourceType) {
        // called before a URL is crawled
    })
    ->onCrawled(function (string $url, CrawlResponse $response, CrawlProgress $progress) {
        // called for every successfully crawled URL
    })
    ->onFailed(function (string $url, RequestException $e, CrawlProgress $progress, ?string $foundOnUrl, ?string $linkText, ?ResourceType $resourceType) {
        // called when a URL could not be crawled
    })
    ->onFinished(function (FinishReason $reason, CrawlProgress $progress) {
        // called when the whole crawl is complete
    })
    ->start();

Every callback except onWillCrawl receives a CrawlProgress object with live crawl statistics. See the tracking progress documentation for details.

The onFinished callback also receives a FinishReason enum that tells you why the crawl stopped.
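
As a minimal sketch of how these callbacks can work together, the snippet below records the status code of every crawled URL and prints a one-line summary when the crawl ends. It relies only on the calls shown on this page plus the name property that every PHP enum case exposes. The $statuses array is an illustrative variable, not part of the package, and the snippet assumes Composer's autoloader is already loaded, as it would be in a Laravel application.

```php
<?php

use Spatie\Crawler\Crawler;
use Spatie\Crawler\CrawlProgress;
use Spatie\Crawler\CrawlResponse;
use Spatie\Crawler\Enums\FinishReason;

// Illustrative variable (not part of the package): maps URL => status code.
$statuses = [];

Crawler::create('https://example.com')
    ->onCrawled(function (string $url, CrawlResponse $response) use (&$statuses) {
        // Captured by reference so writes are visible outside this closure.
        $statuses[$url] = $response->status();
    })
    ->onFinished(function (FinishReason $reason, CrawlProgress $progress) use (&$statuses) {
        // $reason->name is the case name that every PHP enum case exposes.
        $count = count($statuses);
        echo "Crawl finished ({$reason->name}): {$count} URLs crawled\n";
    })
    ->start();
```

Both closures capture $statuses by reference. A by-value capture would copy the empty array at the moment each closure is created, so the summary would always report zero URLs.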

You can register multiple callbacks of the same type. They will all be called in the order they were added.

use Spatie\Crawler\Crawler;
use Spatie\Crawler\CrawlResponse;

Crawler::create('https://example.com')
    ->onCrawled(function (string $url, CrawlResponse $response) {
        // first callback: log to database
    })
    ->onCrawled(function (string $url, CrawlResponse $response) {
        // second callback: send notification
    })
    ->start();
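
To make the ordering guarantee concrete, here is a small, self-contained sketch of how a list of callbacks can be run in registration order. The CallbackList class and its add and dispatch methods are hypothetical names chosen for illustration. This is not the package's actual implementation, only a model of the behaviour described above.

```php
<?php

// Hypothetical stand-in, NOT part of spatie/crawler.
final class CallbackList
{
    /** @var list<callable> Callbacks in the order they were registered. */
    private array $callbacks = [];

    public function add(callable $callback): self
    {
        // Appending keeps registration order.
        $this->callbacks[] = $callback;

        return $this;
    }

    public function dispatch(mixed ...$arguments): void
    {
        // Run every callback, first registered first.
        foreach ($this->callbacks as $callback) {
            $callback(...$arguments);
        }
    }
}

$log = [];

$onCrawled = (new CallbackList())
    ->add(function (string $url) use (&$log) {
        $log[] = "saved {$url}";
    })
    ->add(function (string $url) use (&$log) {
        $log[] = "notified about {$url}";
    });

$onCrawled->dispatch('https://example.com');

print_r($log);
```

Because each callback is appended to an array and the array is iterated from the start, the "saved" entry is always logged before the "notified" entry.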