
Crawler Laravel Package

spatie/crawler

PHP web crawler that discovers links concurrently via Guzzle, with optional JavaScript rendering powered by Chrome/Puppeteer. Configure depth, internal-only rules, and callbacks for per-page handling, plus a fake mode to test crawl logic without real HTTP requests.

Tracking progress

The crawler provides real-time progress tracking through the CrawlProgress object and reports why a crawl stopped through the FinishReason enum.

CrawlProgress

Every onCrawled, onFailed, and onFinished callback receives a CrawlProgress object with the following properties:

use Spatie\Crawler\CrawlProgress;

// Available on every CrawlProgress instance:
$progress->urlsCrawled;   // int (number of URLs successfully crawled)
$progress->urlsFailed;    // int (number of URLs that failed)
$progress->urlsProcessed; // int (urlsCrawled + urlsFailed)
$progress->urlsFound;     // int (total URLs added to the queue)
$progress->urlsPending;   // int (URLs not yet processed)
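
The counters combine naturally into a rough completion figure. The following is a minimal sketch, not part of the package: progressPercent() is a hypothetical helper that takes the two integers directly, so it runs without a crawler. Because urlsFound counts URLs added to the queue, it will likely keep growing while new links are discovered, so the percentage should be read as a moving estimate rather than a fixed target.

```php
<?php

/**
 * Hypothetical helper (not part of spatie/crawler): turns the
 * urlsProcessed and urlsFound counters into a completion percentage.
 */
function progressPercent(int $urlsProcessed, int $urlsFound): float
{
    // Nothing queued yet: report 0 instead of dividing by zero.
    if ($urlsFound <= 0) {
        return 0.0;
    }

    // Round to one decimal place for display.
    return round($urlsProcessed / $urlsFound * 100, 1);
}

// Example with made-up counts: 5 of 20 known URLs processed.
printf("%.1f%%\n", progressPercent(5, 20));
```

Inside a callback, the same helper could be called as progressPercent($progress->urlsProcessed, $progress->urlsFound).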

Here's an example that logs progress during a crawl:

use Spatie\Crawler\Crawler;
use Spatie\Crawler\CrawlProgress;
use Spatie\Crawler\CrawlResponse;

Crawler::create('https://example.com')
    ->onCrawled(function (string $url, CrawlResponse $response, CrawlProgress $progress) {
        echo "[{$progress->urlsProcessed}/{$progress->urlsFound}] {$url}\n";
    })
    ->start();

FinishReason

The start() method returns a FinishReason enum that tells you why the crawl stopped:

use Spatie\Crawler\Crawler;
use Spatie\Crawler\Enums\FinishReason;

$reason = Crawler::create('https://example.com')
    ->limit(100)
    ->start();

// match is an expression, so capture its result in order to use it.
$message = match ($reason) {
    FinishReason::Completed => 'All URLs have been crawled',
    FinishReason::CrawlLimitReached => 'Stopped because the crawl limit was reached',
    FinishReason::TimeLimitReached => 'Stopped because the time limit was reached',
    FinishReason::Interrupted => 'Stopped by a signal (SIGINT/SIGTERM)',
};

echo $message . "\n";
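
When the crawl runs from a command-line script or a scheduled job, the finish reason can also drive the process exit code. The sketch below is one possible policy, not something the package prescribes. DemoFinishReason is a local stand-in that mirrors the four cases listed above so the example runs on its own (PHP 8.1 or later), and exitCodeFor() is a hypothetical helper. In a real project you would match on Spatie\Crawler\Enums\FinishReason instead.

```php
<?php

// Local stand-in for the package enum, so the sketch runs without
// spatie/crawler installed. The case names mirror the ones listed above.
enum DemoFinishReason
{
    case Completed;
    case CrawlLimitReached;
    case TimeLimitReached;
    case Interrupted;
}

// Hypothetical policy: only a full crawl counts as success.
function exitCodeFor(DemoFinishReason $reason): int
{
    return match ($reason) {
        DemoFinishReason::Completed => 0,
        // Hitting a configured limit means the crawl is incomplete.
        DemoFinishReason::CrawlLimitReached,
        DemoFinishReason::TimeLimitReached => 1,
        // 130 is the conventional shell exit code after SIGINT (128 + 2).
        DemoFinishReason::Interrupted => 130,
    };
}

echo exitCodeFor(DemoFinishReason::TimeLimitReached), "\n";
```

Listing every case without a default arm is deliberate: if a new case is ever added to the enum, match throws an UnhandledMatchError instead of silently choosing an exit code.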

The onFinished callback also receives the FinishReason:

use Spatie\Crawler\CrawlProgress;
use Spatie\Crawler\Enums\FinishReason;

Crawler::create('https://example.com')
    ->limit(100)
    ->onFinished(function (FinishReason $reason, CrawlProgress $progress) {
        echo "Crawl finished: {$reason->value}\n";
        echo "Crawled {$progress->urlsCrawled} URLs, {$progress->urlsFailed} failed\n";
    })
    ->start();

Using progress in observers

Observer classes receive CrawlProgress and FinishReason through their method signatures:

use Spatie\Crawler\CrawlObservers\CrawlObserver;
use Spatie\Crawler\CrawlProgress;
use Spatie\Crawler\CrawlResponse;
use Spatie\Crawler\Enums\FinishReason;

class ProgressLogger extends CrawlObserver
{
    public function crawled(
        string $url,
        CrawlResponse $response,
        CrawlProgress $progress,
    ): void {
        echo "[{$progress->urlsProcessed}/{$progress->urlsFound}] {$url}\n";
    }

    public function finishedCrawling(FinishReason $reason, CrawlProgress $progress): void
    {
        echo "Done ({$reason->value}): {$progress->urlsCrawled} crawled, {$progress->urlsFailed} failed\n";
    }
}