wikimedia/html-formatter

Performs transformations of HTML by wrapping around libxml2 and working around its countless bugs.

pkg:composer/wikimedia/html-formatter

4.1.0 2024-03-13 16:33 UTC


README

HtmlFormatter is a library spun off from MediaWiki that lets you load HTML into a DOMDocument, manipulate it, and then get the result back as an HTML string.

Usage

use HtmlFormatter\HtmlFormatter;
// Load HTML that already has doctype and stuff
$formatter = new HtmlFormatter( $html );

// ...or one that doesn't have it
$formatter = new HtmlFormatter( HtmlFormatter::wrapHTML( $html ) );

// Add rules to remove some stuff
$formatter->remove( 'img' );
$formatter->remove( [ '.some_css_class', '#some_id', 'div.some_other_class' ] );
// Only the above syntax is supported, not full CSS/jQuery selectors

// These tags get replaced with their inner HTML,
// e.g. <tag>foo</tag> --> foo
// Only tag names are supported here
$formatter->flatten( 'span' );
$formatter->flatten( [ 'code', 'pre' ] );

// Actually perform the removals and flattening queued above
$formatter->filterContent();

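// Direct DOMDocument manipulations are possible,
// e.g. append a new paragraph to the body
$doc = $formatter->getDoc();
$body = $doc->getElementsByTagName( 'body' )->item( 0 );
$body->appendChild( $doc->createElement( 'p', 'Appended paragraph' ) );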

// Get resulting HTML
$processedHtml = $formatter->getText();
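
To illustrate what "replaced with their inner HTML" means for flatten(), here is a minimal sketch that uses only PHP's built-in DOM extension. It is not the library's implementation: flattenTag() is a hypothetical helper written for this example, and the sample markup is made up.

```php
<?php
// Hypothetical helper, not part of HtmlFormatter: unwraps every <$tagName>
// element in the document, leaving its children where the element used to be.
function flattenTag( DOMDocument $doc, string $tagName ): void {
	// Copy the live node list into an array first, because the list
	// changes while the tree is being edited.
	$elements = iterator_to_array( $doc->getElementsByTagName( $tagName ) );
	foreach ( $elements as $element ) {
		$parent = $element->parentNode;
		// Move each child in front of the element; insertBefore() relocates
		// nodes that are already in the tree.
		while ( $element->firstChild !== null ) {
			$parent->insertBefore( $element->firstChild, $element );
		}
		// The element is now empty and can be dropped.
		$parent->removeChild( $element );
	}
}

// Made-up sample markup for the demonstration.
$doc = new DOMDocument();
$doc->loadHTML( '<html><body><p>Hello <span>flat <b>world</b></span></p></body></html>' );

flattenTag( $doc, 'span' );

// Serialize only the paragraph to see the effect.
$result = $doc->saveHTML( $doc->getElementsByTagName( 'p' )->item( 0 ) );
echo $result, "\n";
```

The span wrapper disappears while its text and the nested b element stay in place, which is the same effect the usage snippet describes as <tag>foo</tag> --> foo.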

License

Copyright 2011-2024 MediaWiki contributors

Released under the GNU General Public License version 2, see COPYING.