m6web / roboxt
Library for parsing a robots.txt file
Installs: 572
Dependents: 2
Suggesters: 0
Security: 0
Stars: 19
Watchers: 15
Forks: 7
Open Issues: 5
Requires
- phpcollection/phpcollection: 0.3.0
Requires (Dev)
- phpspec/phpspec: ~2.0.0
This package is not auto-updated.
Last update: 2021-09-27 00:06:57 UTC
README
Roboxt is a PHP robots.txt file parser.
Usage
# Create a Parser instance
$parser = new \Roboxt\Parser();

# Parse your robots.txt file
$file = $parser->parse("http://www.google.com/robots.txt");

# You can verify that a URL is allowed for a specific user agent
$tests = [
    ["/events", "*"],
    ["/search", "*"],
    ["/search", "badbot"],
];

foreach ($tests as $test) {
    list($url, $agent) = $test;
    if ($file->isUrlAllowedByUserAgent($url, $agent)) {
        echo "\n ✔ $url is allowed by $agent";
    } else {
        echo "\n ✘ $url is not allowed by $agent";
    }
}

# You can also iterate over all user agents specified by the robots.txt file
# and check the type of each directive
foreach ($file->allUserAgents() as $userAgent) {
    echo "\n Agent {$userAgent->getName()}: \n";
    foreach ($userAgent->allDirectives() as $directive) {
        if ($directive->isDisallow()) {
            echo " ✘ {$directive->getValue()} \n";
        } else if ($directive->isAllow()) {
            echo " ✔ {$directive->getValue()} \n";
        }
    }
}
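For a one-off check, the same API can be wrapped in a small helper. The sketch below is illustrative only: it assumes Parser::parse() accepts any fetchable robots.txt URL and reuses isUrlAllowedByUserAgent() exactly as shown above; the helper name isCrawlAllowed is not part of the library.

<?php

require __DIR__ . '/vendor/autoload.php';

use Roboxt\Parser;

# Hypothetical helper: returns true when $userAgent may fetch $path,
# according to the robots.txt found at $robotsTxtUrl.
function isCrawlAllowed($robotsTxtUrl, $path, $userAgent = "*")
{
    $parser = new Parser();
    $file = $parser->parse($robotsTxtUrl);

    return $file->isUrlAllowedByUserAgent($path, $userAgent);
}

# Example call (the result depends on the remote robots.txt content).
var_dump(isCrawlAllowed("http://www.google.com/robots.txt", "/search", "badbot"));

In a real crawler you would likely parse robots.txt once and cache the resulting file object rather than re-fetching it for every URL you check.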
Installation
The recommended way to install Roboxt is through Composer:
$> composer require m6web/roboxt
Running the Tests
Roboxt uses PHPSpec for the unit tests:
$> composer install --dev
$> ./vendor/bin/phpspec run
Credits
- M6Web
- @benja-M-1 and @theodo
License
Roboxt is released under the MIT License.