68publishers / crawler-client-php
Requires
- php: ^7.4 || ^8.0
- ext-json: *
- guzzlehttp/guzzle: ^7.7
- jms/serializer: ^3.24
- symfony/yaml: ^5.4 || ^6.0 || ^7.0
Requires (Dev)
- friendsofphp/php-cs-fixer: ^3.17
- kubawerlos/php-cs-fixer-custom-fixers: ^3.14
- nette/bootstrap: ^3.1
- nette/di: ^3.1
- nette/tester: ^2.4
- phpstan/phpstan: ^1.10
- phpstan/phpstan-nette: ^1.2
- roave/security-advisories: dev-latest
- symplify/phpstan-rules: 12.0.2.72
README
Crawler Client PHP
PHP Client for 68publishers/crawler
Installation
$ composer require 68publishers/crawler-client-php
Client initialization
The client instance is created by calling the static method create().
use SixtyEightPublishers\CrawlerClient\CrawlerClient;

$client = CrawlerClient::create('<full url to your crawler instance>');
The Guzzle library is used to communicate with the Crawler API. To pass custom options to the Guzzle configuration, use the optional second parameter.
use SixtyEightPublishers\CrawlerClient\CrawlerClient;

$client = CrawlerClient::create('<full url to your crawler instance>', [
    'timeout' => 0, // Guzzle request option; 0 means wait indefinitely
]);
Requests to the Crawler API must always be authenticated, so credentials must be provided.
use SixtyEightPublishers\CrawlerClient\CrawlerClient;
use SixtyEightPublishers\CrawlerClient\Authentication\Credentials;

$client = CrawlerClient::create('<full url to your crawler instance>');
$client = $client->withAuthentication(new Credentials('<username>', '<password>'));
Note that the client is immutable: calling any of the with* methods always returns a new instance and leaves the original unchanged.
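The immutability described above can be illustrated with a minimal, self-contained sketch. The class ImmutableClientSketch and its withUsername() method are hypothetical stand-ins, not part of the library; the sketch only shows the general "clone and modify" pattern that with* methods usually follow.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for an immutable client; NOT the library's actual class.
final class ImmutableClientSketch
{
    private ?string $username;

    public function __construct(?string $username = null)
    {
        $this->username = $username;
    }

    // Returns a modified copy and leaves the current instance untouched.
    public function withUsername(string $username): self
    {
        $copy = clone $this;
        $copy->username = $username;

        return $copy;
    }

    public function getUsername(): ?string
    {
        return $this->username;
    }
}

$client = new ImmutableClientSketch();
$authenticated = $client->withUsername('alice');

var_dump($client->getUsername());        // the original instance is unchanged
var_dump($authenticated->getUsername()); // only the copy carries the new value
```

The practical consequence is that the return value of a with* call must be assigned, as in the $client = $client->withAuthentication(...) line above; calling it without using the result has no effect.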
This is all that is needed for the client to work properly. You can read about other options on the Advanced options page.
Nette Framework integration
For integration with the Nette Framework please follow this link.
Working with scenarios
Scenarios are handled by ScenariosController.
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ScenariosController;

$controller = $client->getController(ScenariosController::class);
List scenarios
/**
 * @param int $page
 * @param int $limit
 * @param array<string, string|array<string>> $filter
 *
 * @return \SixtyEightPublishers\CrawlerClient\Controller\Scenario\ScenarioListingResponse
 *
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 */
$response = $controller->listScenarios(1, 10);

$filteredResponse = $controller->listScenarios(1, 10, [
    'name' => 'Test',
    'status' => 'failed',
]);
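Because listing is paginated by page and limit, collecting every item means calling the listing method repeatedly. The following is a minimal sketch of such a loop, written against a hypothetical callable rather than the real controller: this README does not show how items are read from a ScenarioListingResponse, so that part is left to the caller. The function collectAllPages() and the in-memory $fakeFetch are assumptions made for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Collects items from every page of a paginated source.
 *
 * $fetchPage is a hypothetical callable: (int $page, int $limit) => array of items.
 * Iteration stops at the first page that returns fewer items than $limit.
 */
function collectAllPages(callable $fetchPage, int $limit): array
{
    if ($limit < 1) {
        throw new InvalidArgumentException('Limit must be at least 1.');
    }

    $items = [];
    $page = 1;

    do {
        $batch = $fetchPage($page, $limit);
        $items = array_merge($items, $batch);
        $page++;
    } while (count($batch) === $limit);

    return $items;
}

// In-memory stand-in for the API so the sketch runs without a crawler instance.
$data = range(1, 25);
$fakeFetch = static function (int $page, int $limit) use ($data): array {
    return array_slice($data, ($page - 1) * $limit, $limit);
};

$all = collectAllPages($fakeFetch, 10);
echo count($all), PHP_EOL;
```

With the real client, the callable would wrap $controller->listScenarios($page, $limit) and return the items extracted from the response.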
Get scenario
/**
 * @param string $scenarioId
 *
 * @return \SixtyEightPublishers\CrawlerClient\Controller\Scenario\ScenarioResponse
 *
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\NotFoundException
 */
$response = $controller->getScenario('<id>');
Run scenario
/**
 * @param \SixtyEightPublishers\CrawlerClient\Controller\Scenario\RequestBody\ScenarioRequestBody $requestBody
 *
 * @return \SixtyEightPublishers\CrawlerClient\Controller\Scenario\ScenarioResponse
 *
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 */
As a scenario config we can pass a normal array or use prepared value objects. Both options are valid.
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\RequestBody\ScenarioRequestBody;

// Variant 1: config passed as a plain array
$requestBody = new ScenarioRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    config: [
        'scenes' => [ /* ... */ ],
        'options' => [ /* ... */ ],
        'entrypoint' => [ /* ... */ ],
    ],
);

$response = $controller->runScenario($requestBody);
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\RequestBody\ScenarioRequestBody;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\ScenarioConfig;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Entrypoint;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Action;

// Variant 2: config built from the prepared value objects
$requestBody = new ScenarioRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    config: (new ScenarioConfig(new Entrypoint('<url>', 'default')))
        ->withOptions(/* ... */)
        ->withScene('default', [
            new Action('...', [ /* ... */ ]),
            new Action('...', [ /* ... */ ]),
        ]),
);

$response = $controller->runScenario($requestBody);
Validate scenario
/**
 * @param \SixtyEightPublishers\CrawlerClient\Controller\Scenario\RequestBody\ScenarioRequestBody $requestBody
 *
 * @return \SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValidateScenarioResponse
 */
As a scenario config we can pass a normal array or use prepared value objects. Both options are valid.
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\RequestBody\ScenarioRequestBody;

// Variant 1: config passed as a plain array
$requestBody = new ScenarioRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    config: [
        'scenes' => [ /* ... */ ],
        'options' => [ /* ... */ ],
        'entrypoint' => [ /* ... */ ],
    ],
);

$response = $controller->validateScenario($requestBody);
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\RequestBody\ScenarioRequestBody;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\ScenarioConfig;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Entrypoint;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Action;

// Variant 2: config built from the prepared value objects
$requestBody = new ScenarioRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    config: (new ScenarioConfig(new Entrypoint('<url>', 'default')))
        ->withOptions(/* ... */)
        ->withScene('default', [
            new Action('...', [ /* ... */ ]),
            new Action('...', [ /* ... */ ]),
        ]),
);

$response = $controller->validateScenario($requestBody);
Abort scenario
/**
 * @param string $scenarioId
 *
 * @return \SixtyEightPublishers\CrawlerClient\Controller\Common\NoContentResponse
 *
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\NotFoundException
 */
$response = $controller->abortScenario('<id>');
Working with scenario schedulers
Scenario schedulers are handled by ScenarioSchedulersController.
use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ScenarioSchedulersController;

$controller = $client->getController(ScenarioSchedulersController::class);
List scenario schedulers
/**
 * @param int $page
 * @param int $limit
 * @param array<string, string|array<string>> $filter
 *
 * @return \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ScenarioSchedulerListingResponse
 *
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 */
$response = $controller->listScenarioSchedulers(1, 10);

$filteredResponse = $controller->listScenarioSchedulers(1, 10, [
    'name' => 'Test',
    'userId' => '<id>',
]);
Get scenario scheduler
/**
 * @param string $scenarioSchedulerId
 *
 * @return \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ScenarioSchedulerResponse
 *
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\NotFoundException
 */
$response = $controller->getScenarioScheduler('<id>');
$etag = $response->getEtag(); // the ETag is required for an update
Create scenario scheduler
/**
 * @param \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody $requestBody
 *
 * @return \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ScenarioSchedulerResponse
 *
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 */
As a scenario config we can pass a normal array or use prepared value objects. Both options are valid.
use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody;

// Variant 1: config passed as a plain array
$requestBody = new ScenarioSchedulerRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    active: true,
    expression: '0 2 * * *',
    config: [
        'scenes' => [ /* ... */ ],
        'options' => [ /* ... */ ],
        'entrypoint' => [ /* ... */ ],
    ],
);

$response = $controller->createScenarioScheduler($requestBody);
$etag = $response->getEtag(); // the ETag is required for an update
use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\ScenarioConfig;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Entrypoint;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Action;

// Variant 2: config built from the prepared value objects
$requestBody = new ScenarioSchedulerRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    active: true,
    expression: '0 2 * * *',
    config: (new ScenarioConfig(new Entrypoint('<url>', 'default')))
        ->withOptions(/* ... */)
        ->withScene('default', [
            new Action('...', [ /* ... */ ]),
            new Action('...', [ /* ... */ ]),
        ]),
);

$response = $controller->createScenarioScheduler($requestBody);
$etag = $response->getEtag(); // the ETag is required for an update
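The expression argument in the examples above looks like a standard five-field cron expression; under that assumption, '0 2 * * *' means "minute 0, hour 2, every day", that is, daily at 02:00. The README does not say which cron dialect the Crawler accepts, so the helper below (describeCronFields(), a hypothetical name) is only a sketch that labels the five fields for readability.

```php
<?php

declare(strict_types=1);

/**
 * Splits a five-field cron expression into named fields.
 * Assumes the classic layout: minute, hour, day of month, month, day of week.
 */
function describeCronFields(string $expression): array
{
    $parts = preg_split('/\s+/', trim($expression));

    if ($parts === false || count($parts) !== 5) {
        throw new InvalidArgumentException('Expected exactly five cron fields.');
    }

    $names = ['minute', 'hour', 'day_of_month', 'month', 'day_of_week'];

    return array_combine($names, $parts);
}

$fields = describeCronFields('0 2 * * *');
print_r($fields);
```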
Update scenario scheduler
/**
 * @param string $scenarioSchedulerId
 * @param string $etag
 * @param \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody $requestBody
 *
 * @return \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ScenarioSchedulerResponse
 *
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\PreconditionFailedException
 */
As a scenario config we can pass a normal array or use prepared value objects. Both options are valid.
use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody;

// Variant 1: config passed as a plain array
$requestBody = new ScenarioSchedulerRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    active: true,
    expression: '0 2 * * *',
    config: [
        'scenes' => [ /* ... */ ],
        'options' => [ /* ... */ ],
        'entrypoint' => [ /* ... */ ],
    ],
);

$response = $controller->updateScenarioScheduler('<id>', '<etag>', $requestBody);
$etag = $response->getEtag(); // the new ETag is required for the next update
use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\ScenarioConfig;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Entrypoint;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Action;

// Variant 2: config built from the prepared value objects
$requestBody = new ScenarioSchedulerRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    active: true,
    expression: '0 2 * * *',
    config: (new ScenarioConfig(new Entrypoint('<url>', 'default')))
        ->withOptions(/* ... */)
        ->withScene('default', [
            new Action('...', [ /* ... */ ]),
            new Action('...', [ /* ... */ ]),
        ]),
);

$response = $controller->updateScenarioScheduler('<id>', '<etag>', $requestBody);
$etag = $response->getEtag(); // the new ETag is required for the next update
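The ETag passed to updateScenarioScheduler() suggests optimistic concurrency control: an update is presumably accepted only when the supplied ETag still matches the current state of the scheduler, and rejected (likely via PreconditionFailedException) otherwise. The sketch below simulates that behaviour with a hypothetical in-memory class, EtagStoreSketch, so it runs without a crawler instance; the hashing scheme and the RuntimeException are assumptions, not the API's actual implementation.

```php
<?php

declare(strict_types=1);

// Hypothetical in-memory resource that mimics ETag-guarded updates.
final class EtagStoreSketch
{
    private array $data;
    private string $etag;

    public function __construct(array $data)
    {
        $this->data = $data;
        $this->etag = md5(serialize($data));
    }

    public function getEtag(): string
    {
        return $this->etag;
    }

    // Applies the update only when the caller's ETag matches the current one.
    public function update(string $etag, array $data): string
    {
        if ($etag !== $this->etag) {
            throw new RuntimeException('Precondition failed: stale ETag');
        }

        $this->data = $data;
        $this->etag = md5(serialize($data));

        return $this->etag;
    }
}

$store = new EtagStoreSketch(['active' => true]);
$firstEtag = $store->getEtag();

// A successful update yields a fresh ETag for the next update.
$secondEtag = $store->update($firstEtag, ['active' => false]);

// Reusing the old ETag is rejected, much like a failed precondition.
try {
    $store->update($firstEtag, ['active' => true]);
    $staleRejected = false;
} catch (RuntimeException $e) {
    $staleRejected = true;
}
```

In client code this translates to always keeping the ETag from the most recent response and, after a failed precondition, fetching the scheduler again with getScenarioScheduler() before retrying.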
Validate scenario scheduler
/**
 * @param \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody $requestBody
 *
 * @return \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ValidateScenarioSchedulerResponse
 */
As a scenario config we can pass a normal array or use prepared value objects. Both options are valid.
use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody;

// Variant 1: config passed as a plain array
$requestBody = new ScenarioSchedulerRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    active: true,
    expression: '0 2 * * *',
    config: [
        'scenes' => [ /* ... */ ],
        'options' => [ /* ... */ ],
        'entrypoint' => [ /* ... */ ],
    ],
);

$response = $controller->validateScenarioScheduler($requestBody);
use SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\RequestBody\ScenarioSchedulerRequestBody;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\ScenarioConfig;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Entrypoint;
use SixtyEightPublishers\CrawlerClient\Controller\Scenario\ValueObject\Action;

// Variant 2: config built from the prepared value objects
$requestBody = new ScenarioSchedulerRequestBody(
    name: 'My scenario',
    flags: ['my_flag' => 'my_flag_value'],
    active: true,
    expression: '0 2 * * *',
    config: (new ScenarioConfig(new Entrypoint('<url>', 'default')))
        ->withOptions(/* ... */)
        ->withScene('default', [
            new Action('...', [ /* ... */ ]),
            new Action('...', [ /* ... */ ]),
        ]),
);

$response = $controller->validateScenarioScheduler($requestBody);
Activate/deactivate scenario scheduler
/**
 * @param string $scenarioSchedulerId
 *
 * @return \SixtyEightPublishers\CrawlerClient\Controller\ScenarioScheduler\ScenarioSchedulerResponse
 *
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\NotFoundException
 */
// to activate the scenario scheduler:
$response = $controller->activateScenarioScheduler('<id>');

// to deactivate the scenario scheduler:
$response = $controller->deactivateScenarioScheduler('<id>');
Delete scenario scheduler
/**
 * @param string $scenarioSchedulerId
 *
 * @return \SixtyEightPublishers\CrawlerClient\Controller\Common\NoContentResponse
 *
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\BadRequestException
 * @throws \SixtyEightPublishers\CrawlerClient\Exception\NotFoundException
 */
$response = $controller->deleteScenarioScheduler('<id>');
License
The package is distributed under the MIT License. See LICENSE for more information.