DOM parsing in PHP works only if the HTML is perfectly tagged. I need to parse HTML that is not well-formed, and it comes from a remote server, so I can't change it.
<html>
<body>
<table>
<tr>
<td>
1
</td>
<td>
2
</td></td>
</tr>
</table>
When I parse HTML with this structure, it gives a warning: Warning: DOMDocument::loadHTML() [domdocument.loadhtml]: Unexpected end tag : td in Entity, line: 173 in C:\wamp\wwwxxxxxx on line 51
Tools such as Tidy should be able to repair the HTML so you can load it into DOM.
$html = "<html>
<body>
<table>
<tr>
<td>
1
</td>
<td>
2
</td></td>
</tr>
</table>";
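// Let Tidy parse the malformed markup into a repaired document tree.
$tidy = tidy_parse_string($html);
// html() returns the tidyNode for the <html> element; its value is the repaired markup.
$cleanHTML = $tidy->html()->value;
// The repaired markup can now be loaded into DOM.
$doc = new DOMDocument();
$doc->loadHTML($cleanHTML);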
Note: Tidy is not enabled in PHP by default; you have to install or enable the tidy extension before you can use these functions.
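Since Tidy may not be available on every server, here is a minimal sketch of how the repair step could be made optional. The function name loadMessyHtml is made up for illustration and is not part of PHP or of the original answer. It assumes that, when Tidy is missing, it is acceptable to let DOMDocument recover from the bad markup on its own and to collect libxml's warnings with libxml_use_internal_errors() instead of printing them.

```php
<?php
// Hypothetical helper (not from the original post): load possibly malformed
// HTML into a DOMDocument, repairing it with Tidy first when the extension exists.
function loadMessyHtml(string $html): DOMDocument
{
    if (extension_loaded('tidy')) {
        // Same repair step as above: parse, then take the repaired <html> node.
        $tidy = tidy_parse_string($html);
        $root = ($tidy !== false) ? $tidy->html() : null;
        if ($root !== null) {
            $html = $root->value;
        }
    }

    // Collect libxml parse warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    // Restore whatever error-handling mode was active before this call.
    libxml_use_internal_errors($previous);

    return $doc;
}

// Usage with the broken table from the question (note the stray </td>).
$doc = loadMessyHtml("<html><body><table><tr><td>1</td><td>2</td></td></tr></table>");
$cells = $doc->getElementsByTagName('td');
```

The cells can then be read with a normal loop over $cells, for example via $cell->textContent. If you need to know what was wrong with the markup, call libxml_get_errors() before libxml_clear_errors().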