timgws/cleanhtml

Quickly and easily clean up HTML text, leaving only the bare minimum behind.

dev-master 2016-03-11 04:02 UTC

Last update: 2024-11-08 12:02:40 UTC


README

Making HTML clean since late 2012!

Requirements

  • PHP 5.2+
  • php-xml

How to install

    composer require timgws/cleanhtml

How to use

    require 'vendor/autoload.php'; // Composer's autoloader (assumes a Composer install)
    use timgws\CleanHTML\CleanHTML;

    $tidy = new CleanHTML();
    $output = $tidy->clean('<p><strong>I need a shower. I am dirty HTML.</strong>');

$output should now contain:

    <h2>I need a shower. I am dirty HTML.</h2>

The clean() method removes tables, JavaScript, and other unfriendly markup that you probably do not want to keep in user-submitted HTML.

If you want to see some examples, the best place to look is the CleanHTML tests.

What does it do?

  1. Removes additional spaces from HTML
  2. Replaces multiple <br /> tags with paragraph tags
  3. Removes any <script> tags
  4. Renames any <h1> tags to <h2>
  5. Changes <p><strong> tags to <h2>
  6. Replaces <h2><strong> with just <h2> tags
  7. Removes weird <p><span> tags
  8. Uses HTML Purifier to allow only the h1, h2, h3, h4, h5, p, strong, b, ul, ol, li, hr, pre and code tags
  9. Runs steps 3->7 one more time, just to catch anything that might have been missed
  10. Outputs nice clean HTML \o/
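
To make steps 3 and 4 more concrete, here is a minimal sketch using PHP's built-in DOM extension (part of php-xml). It is not the package's actual implementation: the function name sketchStripScriptsAndDemoteH1 is hypothetical, the sketch assumes PHP 5.3.6 or later (for saveHTML with a node argument), and it drops any attributes on the renamed heading.

```php
<?php
// Hypothetical helper for illustration only; NOT part of timgws/cleanhtml.
// It covers just two of the listed steps: removing <script> tags (step 3)
// and renaming <h1> to <h2> (step 4).
function sketchStripScriptsAndDemoteH1($html)
{
    $doc = new DOMDocument();

    // Messy input makes libxml emit warnings; collect them quietly instead.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so it can be found again after parsing.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Step 3: remove every <script> element. The node list is live, so copy
    // it to an array first; removing while iterating would skip elements.
    foreach (iterator_to_array($doc->getElementsByTagName('script')) as $script) {
        $script->parentNode->removeChild($script);
    }

    // Step 4: rename <h1> to <h2> by moving the children into a new element.
    foreach (iterator_to_array($doc->getElementsByTagName('h1')) as $h1) {
        $h2 = $doc->createElement('h2');
        while ($h1->firstChild) {
            $h2->appendChild($h1->firstChild);
        }
        $h1->parentNode->replaceChild($h2, $h1);
    }

    // Serialize only the children of the wrapper <div>, not the whole document.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }

    return trim($out);
}

// Example call; the script is dropped and the <h1> comes back as an <h2>.
echo sketchStripScriptsAndDemoteH1('<h1>Title</h1><script>alert(1)</script><p>Body</p>');
```

Working on a parsed DOM rather than with regular expressions keeps the sketch tolerant of unclosed or oddly nested tags, which is the kind of input a cleaner like this has to expect.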