webhubworks/site-crawler

A straightforward site crawler

1.0.4 2024-11-21 06:14 UTC

Last update: 2024-11-21 06:15:11 UTC


README

Use this site crawler as a quick way to crawl any website. It is useful for detecting slow pages and pages that return HTTP errors.

Please use this crawler responsibly. Do not use it to crawl websites that you do not own or have permission to crawl.

Installation

composer global require webhubworks/site-crawler

Usage

To list all available options, run: site-crawler --help

Example: site-crawler https://example.com --limit=50 --basic-auth=user:pass --exclude=action,imprint
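
To illustrate the kind of check such a crawl performs, here is a minimal Python sketch. It is not the package's implementation (the package is installed through Composer, and its internals are not shown here). The function name crawl, the injected fetch callable, and the slow_after threshold are hypothetical. The sketch also assumes that --limit caps the number of pages visited and that --exclude skips URLs containing any of the given strings; check site-crawler --help for the actual behavior.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
import time


class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, limit=50, exclude=(), slow_after=1.0):
    """Breadth-first crawl of pages on the same host as start_url.

    fetch(url) must return (status_code, html_text). It is injected so the
    sketch stays independent of any particular HTTP client.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    report = []
    while queue and len(report) < limit:
        url = queue.popleft()
        started = time.perf_counter()
        status, body = fetch(url)
        elapsed = time.perf_counter() - started
        report.append({
            "url": url,
            "status": status,
            "seconds": elapsed,
            "slow": elapsed > slow_after,   # assumed threshold, in seconds
            "error": status >= 400,         # HTTP client or server error
        })
        if status >= 400:
            continue  # do not follow links on error pages
        parser = LinkParser()
        parser.feed(body)
        for href in parser.links:
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc != host:
                continue  # stay on the starting host
            if any(part in link for part in exclude):
                continue  # assumed meaning of --exclude
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return report
```

Each entry in the returned report records the URL, the HTTP status, the elapsed time, and whether the page counts as slow or as an error. Passing fetch in from the caller keeps the sketch testable without network access; a real run would supply a function built on an HTTP client of your choice.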

Roadmap

  • Add support for websites containing links in JS-generated markup
  • Run requests in parallel