
PHP HTMLPurifier not removing empty table (but it does on the 'Live Demo')


I'm working with HTMLPurifier on my localhost, purifying an HTML string.

Here's my code:

require_once '/htmlpurifier-4.9.2/library/HTMLPurifier.auto.php';

$html = '<table class="product-description-table">
         <tbody>
         <tr>
         <td class="item" colspan="3">Test Title</td>
         </tr>
         <p class="MsoNormal c2"><strong>Test Paragraph 3</strong></p>
         <p class="MsoNormal c2"><strong>Test Paragraph 2</strong></p>
         <p class="MsoNormal c2"><strong>Test Paragraph 3</strong></p>
         <p class="c5"></p>
         <p class="MsoNormal c2"><strong>&nbsp;</strong></p>
         <strong class="c6"><strong><em><br></em></strong></strong>
         <p class="c2"></p>
         <p class="c4"></p>
         </td>
         <td class="product-content-border"></td>
         </tr>
         <tr>
         <td class="gallery" colspan="3">
         <table>
         <tbody>
         <tr>
         <td></td>
         <td></td>
         </tr>
         <tr>
         <td></td>
         <td></td>
         </tr>
         <tr>
         <td></td>
         <td></td>
         </tr>
         <tr>
         <td></td>
         <td></td>
         </tr>
         <tr>
         <td></td>
         <td></td>
         </tr>
         <tr>
         <td></td>
         <td></td>
         </tr>
         <tr>
         <td></td>
         <td></td>
         </tr>
         <tr>
         <td></td>
         <td></td>
         </tr>
         </tbody>
         </table>
         </td>
         </tr>
         </tbody>
         </table>';

$config = HTMLPurifier_Config::createDefault();
$config->set('AutoFormat.RemoveEmpty', true);
$config->set('AutoFormat.RemoveSpansWithoutAttributes', true);
$purifier = new HTMLPurifier($config);
$clean_html = $purifier->purify($html);

echo $clean_html;

Now, with the exact same string and (presumably) the exact same filters, AutoFormat.RemoveEmpty and AutoFormat.RemoveSpansWithoutAttributes, this works fine on the Live Demo.

Output:

<table class="product-description-table"><tbody><tr><td class="item" colspan="3">Test Title</td>
</tr></tbody></table><p class="MsoNormal c2"><strong>Test Paragraph 3</strong></p>
<p class="MsoNormal c2"><strong>Test Paragraph 2</strong></p>
<p class="MsoNormal c2"><strong>Test Paragraph 3</strong></p>

<p class="MsoNormal c2"><strong> </strong></p>
<strong class="c6"><strong><em><br /></em></strong></strong>

But when I run my PHP code and view the source, the empty table is still there.

Output:

<table class="product-description-table"><tbody><tr><td class="item" colspan="3">Test Title</td>
</tr></tbody></table><p class="MsoNormal c2"><strong>Test Paragraph 3</strong></p>
<p class="MsoNormal c2"><strong>Test Paragraph 2</strong></p>
<p class="MsoNormal c2"><strong>Test Paragraph 3</strong></p>

<p class="MsoNormal c2"><strong> </strong></p>
<strong class="c6"><strong><em><br /></em></strong></strong>
<table><tbody><tr><td></td>
<td></td>
</tr><tr><td></td>
<td></td>
</tr><tr><td></td>
<td></td>
</tr><tr><td></td>
<td></td>
</tr><tr><td></td>
<td></td>
</tr><tr><td></td>
<td></td>
</tr><tr><td></td>
<td></td>
</tr><tr><td></td>
<td></td>
</tr></tbody></table>

Why is this not working? How is my PHP script not getting the same output as the Live Demo?


Solution

  • Got it worked out with HTML Purifier Support. All I had to do was add this to my $config:

    $config->set('AutoFormat.RemoveEmpty.Predicate', [
        'table' =>
            []
    ]);
    

    It now works with version 4.9.2, and the pesky table is gone.
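
    For anyone landing here later, this is roughly how the full configuration fits together. It is only a sketch: the require path is the one from the question, the short $html string is a made-up stand-in for the real markup, and the comment about the default predicate reflects my reading of the HTML Purifier docs (empty td, th and colgroup elements are preserved by default, which would explain why the inner table never counted as empty). Note that the set() calls have to come before the purifier is used, since the config is finalized at that point.

```php
<?php
// Path taken from the question; adjust it to wherever the library lives.
require_once '/htmlpurifier-4.9.2/library/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
$config->set('AutoFormat.RemoveEmpty', true);
$config->set('AutoFormat.RemoveSpansWithoutAttributes', true);

// Override the default predicate, as suggested by HTML Purifier Support.
// Assumption: the default keeps empty td/th/colgroup elements, so replacing
// it with this map lets the empty cells (and then their rows) be stripped.
$config->set('AutoFormat.RemoveEmpty.Predicate', ['table' => []]);

// Create the purifier only after every directive has been set.
$purifier = new HTMLPurifier($config);

// Hypothetical, shortened input standing in for the markup in the question.
$html = '<p>Kept paragraph</p>'
      . '<table><tbody><tr><td></td><td></td></tr></tbody></table>';

echo $purifier->purify($html);
```

    To compare against the defaults on a given install, $config->get('AutoFormat.RemoveEmpty.Predicate') can be dumped before overriding it.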