How do I install spatie/crawler in a Laravel project?

Run `composer require spatie/crawler` in your project directory. It is a plain PHP package, so it needs no Laravel-specific configuration. The one exception is JavaScript rendering, which requires Chrome/Puppeteer to be installed.

Can I crawl JavaScript-rendered sites (e.g., React/Vue SPAs) with this package?

Yes, the crawler supports dynamic content via Puppeteer/Chrome. Enable it by calling `withBrowser()` on the crawler instance, but note this adds overhead and requires Chrome to be installed in your environment.

How do I limit the crawl depth or scope to internal links only?

Use `depth($maxDepth)` to restrict crawl depth and `internalOnly()` to focus on links within the same domain. For example, `Crawler::create('https://example.com')->internalOnly()->depth(3)->start()`.
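
To make the two limits concrete, here is a minimal sketch of what "internal only" and "maximum depth" mean for a set of discovered links. It is plain PHP and does not reproduce the package's own implementation; `isInternalUrl()` and `filterCrawlScope()` are hypothetical helper names, and "internal" is assumed to mean "same host".

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of spatie/crawler): returns true when
 * $candidate has the same host as $base, ignoring case.
 */
function isInternalUrl(string $base, string $candidate): bool
{
    $baseHost = parse_url($base, PHP_URL_HOST);
    $candidateHost = parse_url($candidate, PHP_URL_HOST);

    // URLs without a parsable host cannot be compared.
    if (!is_string($baseHost) || !is_string($candidateHost)) {
        return false;
    }

    return strtolower($baseHost) === strtolower($candidateHost);
}

/**
 * Hypothetical helper: keeps only internal links found at or below $maxDepth.
 *
 * @param array<string, int> $linkDepths map of URL => depth at which it was found
 * @return list<string>
 */
function filterCrawlScope(string $base, array $linkDepths, int $maxDepth): array
{
    $kept = [];
    foreach ($linkDepths as $url => $depth) {
        if ($depth <= $maxDepth && isInternalUrl($base, $url)) {
            $kept[] = $url;
        }
    }

    return $kept;
}
```

A same-host comparison treats subdomains as external; whether the package does the same is not stated here, so check its documentation before relying on that behavior.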

Is spatie/crawler compatible with Laravel 10+?

Yes, the package supports Laravel 9 and 10. For newer Laravel releases, check the [GitHub repo](https://github.com/spatie/crawler) for the latest version and its requirements, as the updates that add support for them may include breaking changes.

How can I test crawl logic without hitting real APIs?

Use the `fake()` method to simulate responses. Pass an associative array that maps URLs to their HTML content, like `->fake(['https://example.com' => '<html>...</html>'])`. This works with `foundUrls()` and `onCrawled` callbacks.

What’s the best way to handle large-scale crawls (e.g., 10K+ pages) in Laravel?

For scalability, pair the crawler with Laravel queues (e.g., jobs that implement `ShouldQueue`) and distribute the jobs across workers, using Redis as the queue backend and Horizon to manage the workers. The crawler itself runs in a single process, so an external queue system is needed for distributed crawling.
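
One way to split the work is to cut the URL list into batches and hand each batch to its own queued job. The sketch below shows only the batching step in plain PHP; `batchUrlsForJobs()` is a hypothetical helper, and the job class that would receive each batch is left out because it depends on your application.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: splits a URL list into fixed-size batches,
 * one batch per queued job.
 *
 * @param list<string> $urls
 * @return list<list<string>>
 */
function batchUrlsForJobs(array $urls, int $batchSize): array
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    // Drop duplicates first so no two jobs crawl the same URL.
    return array_chunk(array_values(array_unique($urls)), $batchSize);
}
```

In a Laravel application each batch could then be dispatched to a job that implements `ShouldQueue`, so that separate workers crawl separate batches.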

Does spatie/crawler respect robots.txt or rate limits?

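This depends on the installed version, so check its documentation. Releases up to 8.x respect `robots.txt` by default (call `ignoreRobots()` to opt out) and provide `setDelayBetweenRequests()` for a fixed pause between requests. For stricter rate limits, add a Guzzle middleware or a custom observer, especially when crawling external sites.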
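
A Guzzle middleware is a callable that receives the next handler and returns a new handler, so a simple throttle can be sketched without any package-specific API. The example below is an illustration under that assumption; `throttleMiddleware()` is a hypothetical name, and type hints on the request are omitted so the sketch runs without Guzzle installed.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical throttle middleware in the shape Guzzle expects:
 * a callable that receives the next handler and returns a new handler.
 * It enforces a minimum gap between consecutive requests.
 */
function throttleMiddleware(int $minGapMs): callable
{
    return function (callable $handler) use ($minGapMs): callable {
        $lastSentAt = 0.0; // microtime of the previous request, 0.0 before the first one

        return function ($request, array $options) use ($handler, $minGapMs, &$lastSentAt) {
            $elapsedMs = (microtime(true) - $lastSentAt) * 1000;
            if ($lastSentAt > 0.0 && $elapsedMs < $minGapMs) {
                // Sleep for the remainder of the gap before passing the request on.
                usleep((int) round(($minGapMs - $elapsedMs) * 1000));
            }
            $lastSentAt = microtime(true);

            return $handler($request, $options);
        };
    };
}
```

With Guzzle installed, a middleware like this is normally registered through `HandlerStack::push()`. How the resulting client is handed to the crawler depends on the package version. Note that the sleep blocks the whole process, which also slows down concurrent requests.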

How do I store crawled URLs or responses in a Laravel database?

The crawler returns raw data (e.g., `foundUrls()` or `CrawlResponse`). Manually persist this to a database using Eloquent or a custom table. Example: `Url::create(['path' => $url, 'status' => $response->status()])`.

Are there alternatives to spatie/crawler for Laravel?

Yes, consider `symfony/panther` (headless browser testing) or `laravel-web-scraper` for simpler scraping. For distributed crawling, `spatie/crawler` paired with queues is more robust than standalone tools like `Goutte` (PHP-only, no JS).

How do I debug failed crawls or timeouts in production?

Use observers to log errors (e.g., `->onError(function ($url, $exception) { Log::error($exception); })`). For timeouts, adjust Guzzle’s timeout settings or implement retry logic with `shouldRetryCallback()`.
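
If you handle retries at the Guzzle level instead, `GuzzleHttp\Middleware::retry()` accepts a decider callable and a delay callable. The sketch below shows one possible pair; `shouldRetry()` and `retryDelayMs()` are hypothetical names, the limit of three retries and the 500 ms base delay are arbitrary choices, and type hints are omitted so the code runs without Guzzle installed.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical retry decider in the argument order used by
 * GuzzleHttp\Middleware::retry(): retries so far, request, response, error.
 */
function shouldRetry(int $retries, $request, $response = null, $error = null): bool
{
    $maxRetries = 3; // arbitrary limit for this sketch
    if ($retries >= $maxRetries) {
        return false;
    }

    // Connection failures and timeouts arrive as an error with no response.
    if ($error !== null) {
        return true;
    }

    // Retry on "too many requests" and on server errors only.
    $status = $response !== null ? $response->getStatusCode() : 0;

    return $status === 429 || $status >= 500;
}

/**
 * Exponential backoff in milliseconds: 500, 1000, 2000, ...
 * Guzzle passes the retry count starting at 1.
 */
function retryDelayMs(int $retries): int
{
    return 500 * (2 ** max(0, $retries - 1));
}
```

With Guzzle available, the two functions would be combined as `GuzzleHttp\Middleware::retry('shouldRetry', 'retryDelayMs')` and pushed onto the client's handler stack. Client errors such as 404 are deliberately not retried, since repeating them rarely changes the result.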

Crawler Laravel Package

spatie/crawler

PHP web crawler with concurrent requests via Guzzle. Crawl internal/external links, limit depth, and collect discovered URLs. Supports JavaScript-rendered sites using Chrome/Puppeteer. Includes faking responses for testing crawl logic without real HTTP calls.

https://spatie.be/docs/crawler

Popularity trends

Recorded values over time (once-a-day snapshots), Jun 22, 2026 – Jul 21, 2026. Charted series: GitHub stars, GitHub forks, GitHub watchers, Packagist monthly downloads.

Stars: 2,828
Favorites: 2,835
Forks: 367
Score: 55.0

Score breakdown

Sum of components, capped 0–100. Halved if archived.

Stars (input: 2828): +14.1
Forks (input: 367): +11.0
Open issues + PRs (input: 0): +0.0
Releases (input: 38): +11.4
Recency (input: 39): +17.9
Issue opportunity (input: 0): +0.0
Laravel News mentions (input: 0): +0.0
Dependents (input: 0): +0.0

Total: 54.4
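
The total can be reproduced from the listed components. The sketch below is only an illustration of the stated rule; `packageScore()` is a hypothetical name, and applying the cap before the halving is an assumption, since the page does not say which comes first.

```php
<?php

declare(strict_types=1);

// Component values copied from the score breakdown.
$components = [
    'Stars' => 14.1,
    'Forks' => 11.0,
    'Open issues + PRs' => 0.0,
    'Releases' => 11.4,
    'Recency' => 17.9,
    'Issue opportunity' => 0.0,
    'Laravel News mentions' => 0.0,
    'Dependents' => 0.0,
];

/**
 * Sums the components, caps the sum to 0–100, then halves it if archived.
 * The order of capping and halving is an assumption of this sketch.
 *
 * @param array<string, float> $components
 */
function packageScore(array $components, bool $archived = false): float
{
    $total = max(0.0, min(100.0, array_sum($components)));

    return round($archived ? $total / 2 : $total, 1);
}

$score = packageScore($components);
```

For the values listed, `$score` comes out at 54.4, which matches the breakdown total.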

Opportunity: 38.8

Opportunity score breakdown

Hidden gem signal × 0.65 + contribution need × 0.35, scaled by health factor.

Hidden gem, log(monthly_downloads / (stars + 1)) × 25: 60.4
Contribution need, open_issues + open_prs = 0: 0.0
Health factor, archived + recency + open issues: ×0.98

Total: 38.6
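
As a rough check, the formula can be evaluated with the rounded values shown on the page. This is a sketch under two assumptions: the logarithm is taken to be base 10, which the page does not state, and the function names `hiddenGemSignal()` and `opportunityScore()` are hypothetical.

```php
<?php

declare(strict_types=1);

/** Hidden gem signal: log(monthly_downloads / (stars + 1)) × 25, with base 10 assumed. */
function hiddenGemSignal(int $monthlyDownloads, int $stars): float
{
    return log10($monthlyDownloads / ($stars + 1)) * 25;
}

/** Weighted sum of the two signals, scaled by the health factor. */
function opportunityScore(float $hiddenGem, float $contributionNeed, float $healthFactor): float
{
    return round(($hiddenGem * 0.65 + $contributionNeed * 0.35) * $healthFactor, 1);
}

// Inputs as displayed: 733K downloads per month, 2,828 stars, health factor 0.98.
$hiddenGem = hiddenGemSignal(733000, 2828);
$opportunity = opportunityScore(60.4, 0.0, 0.98);
```

With these inputs the hidden gem signal is about 60.3 and the opportunity score is 38.5, close to the listed 60.4 and 38.6. The small gaps are consistent with the page rounding its displayed inputs, such as the download count and the health factor.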

License: MIT
Last release: Jun 12, 2026
Downloads: 733K/mo
