wimski / html-data-extractor
Extract data from an HTML string by using placeholders in a reverse template.
2.2.0
2022-10-11 10:18 UTC
Requires
- php: ^8.1
- ext-dom: *
- ext-libxml: *
- symfony/css-selector: ^6.1
- symfony/dom-crawler: ^6.1
Requires (Dev)
- phpstan/phpstan: ^1.8
- phpunit/phpunit: ^9.5
README
HTML Data Extractor
This package lets you easily extract data from an HTML string by using a reverse template in Twig style.
Changelog
Setup
Install
composer require wimski/html-data-extractor
Bindings
use Wimski\HtmlDataExtractor\Extractors\HtmlDataExtractor;
use Wimski\HtmlDataExtractor\Factories\SelectorFactory;
use Wimski\HtmlDataExtractor\HtmlLoader;
use Wimski\HtmlDataExtractor\Source\SourceParser;
use Wimski\HtmlDataExtractor\Matching\GroupMatcher;
use Wimski\HtmlDataExtractor\Matching\PlaceholderMatcher;
use Wimski\HtmlDataExtractor\Template\TemplateDataExtractor;
use Wimski\HtmlDataExtractor\Template\TemplateGroupsValidator;
use Wimski\HtmlDataExtractor\Template\TemplateParser;
use Wimski\HtmlDataExtractor\Template\TemplateRootNodeExtractor;
use Wimski\HtmlDataExtractor\Template\TemplateValidator;

$htmlLoader         = new HtmlLoader();
$placeholderMatcher = new PlaceholderMatcher();
$groupMatcher       = new GroupMatcher();

$templateGroupsValidator   = new TemplateGroupsValidator($htmlLoader, $groupMatcher);
$templateValidator         = new TemplateValidator($templateGroupsValidator);
$selectorFactory           = new SelectorFactory($placeholderMatcher);
$templateDataExtractor     = new TemplateDataExtractor($placeholderMatcher);
$templateRootNodeExtractor = new TemplateRootNodeExtractor($htmlLoader);

$templateParser = new TemplateParser(
    $templateValidator,
    $groupMatcher,
    $selectorFactory,
    $templateRootNodeExtractor,
    $templateDataExtractor,
);

$sourceParser = new SourceParser();

$htmlDataExtractor = new HtmlDataExtractor(
    $templateParser,
    $sourceParser,
);
Usage
Comprehensive documentation has yet to be written. In the meantime, see HtmlDataExtractorTest
for an example.
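As a rough illustration of what a reverse template means, the sketch below turns a template containing Twig-style placeholders into a pattern and reads the placeholder values back out of a matching HTML string. It does not use this package's API: the function extractWithReverseTemplate and the exact placeholder syntax are hypothetical, and it relies on a regular expression instead of the DOM parsing and CSS selectors this package depends on, so it only handles HTML that matches the template character for character.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper for illustration only; NOT part of wimski/html-data-extractor.
 *
 * Treats every `{{ name }}` in $template as a named capture group and
 * every other character as literal text that must appear in $html.
 *
 * @return array<string, string>|null Placeholder values, or null when $html does not match.
 */
function extractWithReverseTemplate(string $template, string $html): ?array
{
    // Split into literal text (even indexes) and placeholder names (odd indexes).
    $parts = preg_split(
        '/\{\{\s*([A-Za-z_]\w*)\s*\}\}/',
        $template,
        -1,
        PREG_SPLIT_DELIM_CAPTURE,
    );

    $pattern = '';
    foreach ($parts as $index => $part) {
        $pattern .= $index % 2 === 1
            ? '(?<' . $part . '>.*?)'   // placeholder: lazy named capture
            : preg_quote($part, '~');   // literal markup: escaped verbatim
    }

    if (preg_match('~^' . $pattern . '$~s', $html, $matches) !== 1) {
        return null;
    }

    // preg_match returns both numeric and named keys; keep the named ones.
    return array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
}

$template = '<h1>{{ title }}</h1><p class="price">{{ price }}</p>';
$html     = '<h1>Widget</h1><p class="price">9.99</p>';

print_r(extractWithReverseTemplate($template, $html));
```

With the inputs above the helper returns an array with the keys title and price holding Widget and 9.99. Any difference in whitespace or attribute order between template and source makes this naive version return null, which is the kind of brittleness a DOM-based extractor avoids.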