php html-table html-email

How to overcome malformed HTML in an HTML table on a PHP website


I have an HTML table that displays content from emails. The email content can come from any email client (browser, Outlook, Thunderbird...). If this HTML is malformed, it makes a mess of my HTML table.

For example, if the email content has unclosed tags or a malformed table in it, it destroys the structure of the table in which I display the email content.

Could you please tell me whether there is any way to display this content without disturbing my main table?

Here is a screenshot:

http://screencast.com/t/ilUA4b89


Solution

  • You can use DOMDocument to load the HTML content from the email. You can then extract the body content and repair the malformed HTML.

    Example:

    <?php

        // Note that the <div> tags are not closed properly.
        $emailContent = '<html><head></head><body><div>bla<div><h1>hgjhjdfg</h1></body></html>';

        // Collect libxml parse warnings instead of printing them for malformed markup.
        libxml_use_internal_errors(true);

        $dom = new DOMDocument();
        $new = new DOMDocument();
        $dom->loadHTML($emailContent);
        libxml_clear_errors();

        // Extract the body.
        $body = $dom->getElementsByTagName('body')->item(0);

        // Copy each child of the body into the new, empty document.
        foreach ($body->childNodes as $child) {
            $new->appendChild($new->importNode($child, true));
        }

        // DOMDocument closes the unclosed tags while parsing, so saveHTML() outputs the repaired markup.
        echo $new->saveHTML();

    ?>
    

    Output:

    <div>bla<div><h1>hgjhjdfg</h1></div></div>
    

    As you can see, the unclosed <div> tags are now closed. If you do this with the HTML from the email (or wherever you are getting it from), you can fix the HTML this way.

    However, DOMDocument does not do a very good job of repairing malformed HTML, so the results can be unexpected, but at least your table is not affected.

    Alternatives

    You can also use Ajax to load each piece of email content separately. The browser's DOM does a better job of repairing malformed HTML.

    Other problems

    You must also filter JavaScript out of the email content. If the content contains JavaScript, it can also break your table or even the entire web page.

    For example, someone could put this inside an email (jQuery): $('body').hide();
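
The JavaScript filtering described above could be sketched with DOMDocument as well. This is only a minimal illustration, not a complete sanitizer: the function name stripJavascript is hypothetical (it is not part of PHP or of the original answer), and it only handles <script> elements, inline on* event attributes and javascript: URLs. For real mail content a maintained sanitizer library such as HTML Purifier is probably the safer choice.

```php
<?php

// Hypothetical helper, not from the original answer: removes <script> elements,
// on* event-handler attributes and attributes whose value starts with "javascript:".
function stripJavascript(string $html): string
{
    if (trim($html) === '') {
        return '';
    }

    // Collect libxml warnings about malformed markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Copy the live node list to an array first, because removing nodes
    // while iterating a live DOMNodeList skips entries.
    foreach (iterator_to_array($dom->getElementsByTagName('script'), false) as $script) {
        $script->parentNode->removeChild($script);
    }

    foreach ($dom->getElementsByTagName('*') as $element) {
        foreach (iterator_to_array($element->attributes, false) as $attribute) {
            $name  = strtolower($attribute->nodeName);
            $value = strtolower(trim($attribute->nodeValue));
            if (strpos($name, 'on') === 0 || strpos($value, 'javascript:') === 0) {
                $element->removeAttribute($attribute->nodeName);
            }
        }
    }

    // Return only the contents of <body>, as in the repair example.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }

    return $out;
}
```

You would call it on the email body before echoing it into the table cell, for example echo stripJavascript($emailContent);. Because the content goes through DOMDocument, the unclosed tags are repaired in the same pass.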