wimski/html-data-extractor

Extract data from an HTML string by using placeholders in a reverse template.

2.2.0 2022-10-11 10:18 UTC


HTML Data Extractor

This package extracts data from an HTML string using a reverse template written in Twig-style syntax: the template mirrors the expected HTML, and placeholders mark the values to capture.

Changelog

View the changelog.

Setup

Install

composer require wimski/html-data-extractor

Bindings

use Wimski\HtmlDataExtractor\Extractors\HtmlDataExtractor;
use Wimski\HtmlDataExtractor\Factories\SelectorFactory;
use Wimski\HtmlDataExtractor\HtmlLoader;
use Wimski\HtmlDataExtractor\Source\SourceParser;
use Wimski\HtmlDataExtractor\Matching\GroupMatcher;
use Wimski\HtmlDataExtractor\Matching\PlaceholderMatcher;
use Wimski\HtmlDataExtractor\Template\TemplateDataExtractor;
use Wimski\HtmlDataExtractor\Template\TemplateGroupsValidator;
use Wimski\HtmlDataExtractor\Template\TemplateParser;
use Wimski\HtmlDataExtractor\Template\TemplateRootNodeExtractor;
use Wimski\HtmlDataExtractor\Template\TemplateValidator;

// Low-level services without dependencies of their own.
$htmlLoader                = new HtmlLoader();
$placeholderMatcher        = new PlaceholderMatcher();
$groupMatcher              = new GroupMatcher();

// Services built on top of the loader and matchers above.
$templateGroupsValidator   = new TemplateGroupsValidator($htmlLoader, $groupMatcher);
$templateValidator         = new TemplateValidator($templateGroupsValidator);
$selectorFactory           = new SelectorFactory($placeholderMatcher);
$templateDataExtractor     = new TemplateDataExtractor($placeholderMatcher);
$templateRootNodeExtractor = new TemplateRootNodeExtractor($htmlLoader);

// The template parser combines all template-related services.
$templateParser = new TemplateParser(
    $templateValidator,
    $groupMatcher,
    $selectorFactory,
    $templateRootNodeExtractor,
    $templateDataExtractor,
);

$sourceParser = new SourceParser();

// The extractor itself only needs the two parsers.
$htmlDataExtractor = new HtmlDataExtractor(
    $templateParser,
    $sourceParser,
);

Usage

Comprehensive documentation has yet to be written. In the meantime, see HtmlDataExtractorTest for an example.
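
Until that documentation exists, the general idea of a reverse template can be illustrated with a small standalone sketch. This is not the package's API or implementation: extractWithReverseTemplate() is a hypothetical helper, and it swaps in a plain regular expression where the package uses its own parsers, selectors and matchers. It only shows how Twig-style placeholders in a template can map to values in matching HTML.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, NOT part of wimski/html-data-extractor.
 *
 * Turns a template such as '<h1>{{ title }}</h1>' into a regular expression
 * with one named capture group per placeholder and matches it against $html.
 *
 * @return array<string, string>|null Placeholder values, or null if the HTML does not match.
 */
function extractWithReverseTemplate(string $template, string $html): ?array
{
    // Split into literal markup and placeholder names. With
    // PREG_SPLIT_DELIM_CAPTURE, odd indexes hold the captured names.
    $parts = preg_split(
        '/\{\{\s*([A-Za-z_]\w*)\s*\}\}/',
        $template,
        -1,
        PREG_SPLIT_DELIM_CAPTURE,
    );

    $pattern = '';
    foreach ($parts as $index => $part) {
        $pattern .= $index % 2 === 1
            ? '(?<' . $part . '>.*?)'   // placeholder: lazy named capture
            : preg_quote($part, '~');   // literal markup: escaped verbatim
    }

    if (preg_match('~' . $pattern . '~s', $html, $matches) !== 1) {
        return null;
    }

    // Keep only the named groups and drop the numeric duplicates.
    return array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
}

$data = extractWithReverseTemplate(
    '<h1>{{ title }}</h1><p class="price">{{ price }}</p>',
    '<div><h1>Blue Widget</h1><p class="price">9.99</p></div>',
);
// $data is ['title' => 'Blue Widget', 'price' => '9.99']
```

A regex-based sketch like this breaks as soon as whitespace or attribute order differs between template and source, which is one reason a real extractor would work on the parsed DOM instead.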