mediashare/crawler

Crawls URLs from a webpage and provides a DomCrawler through the Scraper library.

0.2.8 2021-11-27 19:44 UTC

DomCrawler

Scraper uses the DomCrawler library, a Symfony component for navigating the DOM of HTML and XML documents. See the DomCrawler documentation for details.

Installation

composer require mediashare/crawler

Usage

<?php
require 'vendor/autoload.php';

use Mediashare\Crawler\Crawler;

$crawler = new Crawler("https://mediashare.fr");
$crawler->run();
dump($crawler);

With Config

<?php
require 'vendor/autoload.php';

use Mediashare\Crawler\Crawler;
use Mediashare\Crawler\Config;

$config = new Config();
$config->setWebspider(true); // Crawl the whole website, not just the first page
$config->setVerbose(true); // Display a progress bar
$config->setPathRequires(['/Kernel/']); // Only crawl URLs under these paths
$config->setPathExceptions(['/CodeSnippet/']); // Skip URLs under these paths

$crawler = new Crawler("https://mediashare.fr", $config);
$crawler->run();
dump($crawler);
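
The setPathRequires and setPathExceptions options above combine an allow-list with a deny-list. The sketch below shows one way such a filter could behave. It is not the library's implementation: the shouldCrawl helper is hypothetical, the example URLs are made up, and it assumes each entry is matched as a plain substring of the URL path, which the README does not confirm.

```php
<?php
// Hypothetical helper, not part of mediashare/crawler.
// Assumption: each filter entry is a plain substring of the URL path.
function shouldCrawl(string $url, array $pathRequires, array $pathExceptions): bool
{
    // parse_url() returns null when the URL has no path and false when it is malformed.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        $path = '/';
    }

    // Reject the URL as soon as one excluded fragment appears in its path.
    foreach ($pathExceptions as $fragment) {
        if (strpos($path, $fragment) !== false) {
            return false;
        }
    }

    // With no required fragments, every remaining URL is accepted.
    if ($pathRequires === []) {
        return true;
    }

    // Otherwise at least one required fragment must appear in the path.
    foreach ($pathRequires as $fragment) {
        if (strpos($path, $fragment) !== false) {
            return true;
        }
    }

    return false;
}

$requires = ['/Kernel/'];
$exceptions = ['/CodeSnippet/'];

// Illustrative URLs only; these paths are not taken from the real site.
var_dump(shouldCrawl('https://mediashare.fr/Kernel/Install', $requires, $exceptions));          // bool(true)
var_dump(shouldCrawl('https://mediashare.fr/Kernel/CodeSnippet/Demo', $requires, $exceptions)); // bool(false)
var_dump(shouldCrawl('https://mediashare.fr/Blog/', $requires, $exceptions));                   // bool(false)
```

In this sketch exclusions are checked first, so a URL that matches both lists is skipped. Whether the library resolves such conflicts the same way is not stated here.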