vanilla / htmlawed
A composer wrapper for the htmLawed library to purify & filter HTML. Tested with PHPUnit and PhantomJS!
Installs: 511 229
Dependents: 6
Suggesters: 0
Security: 0
Stars: 40
Watchers: 17
Forks: 15
Open Issues: 3
Language:HTML
Requires
- php: >=7.4.0
Requires (Dev)
- phpunit/phpunit: ^9.0
- tburry/pquery: ~1.0.1
This package is auto-updated.
Last update: 2024-10-29 20:50:00 UTC
README
A composer wrapper for the htmLawed library to purify & filter HTML. Tested with PHPUnit and PhantomJS.
Why use htmLawed?
If your website has any user-generated content then you need to worry about cross-site scripting (XSS). htmLawed will take a piece of potentially malicious html and remove the malicious code, leaving the rest of html behind.
Beyond the base htmLawed library, this package makes htmLawed a composer package and wraps it in an object so that it can be autoloaded.
Installation
htmLawed requres PHP 5.4 or higher
htmLawed is PSR-4 compliant and can be installed using composer. Just add vanilla/htmlawed
to your composer.json.
"require": { "vanilla/htmlawed": "~1.0" }
Example
echo Htmlawed::filter('<h1>Hello world!'); // Outputs: '<h1>Hello world!</h1>'. echo Htmlawed::filter('<i>nothing to see</i><script>alert("xss")</script>') // Outputs: '<i>nothing to see</i>alert("xss")'
Configs and Specs
The htmLawed filter takes two optional parameters: $config
and $spec
. This library provides sensible defaults to these parameters, but you can override them in Htmlawed::filter()
.
$xss = "<i>nothing to see <script>alert('xss')</script>"; // Pass an empty config and spec for no filtering of malicious code. echo Htmlawed::filter($xss, [], []); // Outputs: '<i>nothing to see <script type="text/javascript">alert("xss")</script></i>' // Pass safe=1 to turn on all the safe options. echo Htmlawed::filter($xss, ['safe' => 1]); // Outputs: '<i>nothing to see alert("xss")</i>' // We provide a convenience method that strips all tags that aren't supposed to be in rss feeds. echo Htmlawed::filterRSS('<html><body><h1>Hello world!</h1></body></html>'); // Outputs: '<h1>Hello world!</h1>'
See the htmLawed documentation for the full list of options.
Differences in Vanilla's version of Htmlawed
We try and use the most recent version of htmLawed with as few changes as possible so that bug fixes and security releases can be merged from the main project. However, We've made a few changes in the source code.
- Balance tags (hl_bal) before validating tags (hl_tag). We found some cases where an unbalanced script tag would not get removed and this addresses that issue.
- Don't add an extra
<div>
inside of<blockquote>
tags. - Remove naked
<span>
. - Change indentation from 1 space to 4 spaces.
If the original author of htmLawed wants to make any of these changes upstream please get in contact with support@vanillaforums.com.