Crawler Laravel Package

spatie/crawler

PHP web crawler that discovers links concurrently via Guzzle, with optional JavaScript rendering powered by Chrome/Puppeteer. Configure depth, internal-only rules, and callbacks for per-page handling, plus a fake mode to test crawl logic without real HTTP requests.

Crawling across requests

You can use limitPerExecution() to break up long-running crawls across multiple HTTP requests. This is useful in serverless environments or when you want to avoid timeouts. See "Setting crawl limits" for all available limit options.

Initial request

To start crawling across different requests, create a queue instance and pass it to the crawler. The crawler fills the queue as pages are processed and new URLs are discovered. After the crawler finishes (because it hit the per-execution limit), serialize and store the queue.

use Spatie\Crawler\Crawler;
use Spatie\Crawler\CrawlQueues\ArrayCrawlQueue;

$queue = new ArrayCrawlQueue(); // or your custom queue

// Crawl the first batch of URLs
Crawler::create('https://example.com')
    ->crawlQueue($queue)
    ->limitPerExecution(10)
    ->start();

// Serialize and store the queue for the next request
$serializedQueue = serialize($queue);

Subsequent requests

On subsequent requests, unserialize the queue and pass it to the crawler:

use Spatie\Crawler\Crawler;

$queue = unserialize($serializedQueue);

// Crawl the next batch of URLs
Crawler::create('https://example.com')
    ->crawlQueue($queue)
    ->limitPerExecution(10)
    ->start();

// Serialize and store the queue again
$serializedQueue = serialize($queue);

The crawler's behavior depends on the state stored in the queue. It continues where it left off only if it receives the same queue, restored from its serialized form. If you pass a completely new queue, the limits of previous crawls won't apply.
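
The examples above leave open where $serializedQueue lives between requests. One possible approach is sketched below, assuming a writable file path is available. The helper names storeQueue() and loadQueue() are hypothetical and not part of spatie/crawler; they work with any serializable object.

```php
<?php

// Hypothetical helpers (not part of spatie/crawler): persist a
// serializable queue object to a file between executions.

function storeQueue(string $path, object $queue): void
{
    // LOCK_EX keeps two concurrent writers from interleaving their output.
    if (file_put_contents($path, serialize($queue), LOCK_EX) === false) {
        throw new RuntimeException("Could not write queue to {$path}");
    }
}

function loadQueue(string $path): ?object
{
    // First execution: nothing has been stored yet.
    if (!is_file($path)) {
        return null;
    }

    $data = file_get_contents($path);
    if ($data === false) {
        throw new RuntimeException("Could not read queue from {$path}");
    }

    // Only unserialize data your own application wrote.
    $queue = unserialize($data);

    // Treat corrupt or unexpected content as "no stored queue".
    return is_object($queue) ? $queue : null;
}
```

With helpers like these, each execution could call loadQueue($path) ?? new ArrayCrawlQueue(), pass the result to crawlQueue(), and call storeQueue($path, $queue) once start() returns. In a serverless setup, where the local filesystem may not persist, the file would likely be replaced by a cache entry or database row.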

A more detailed example can be found in this repository.
