I want to fetch the HTML content of this page as a string using file_get_contents:
https://www.emitennews.com/search/
Then I want to unminify the HTML code.
Here is what I have tried so far to unminify it:
$html = file_get_contents("https://www.emitennews.com/search/");
$dom = new \DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->loadHTML($html,LIBXML_HTML_NOIMPLIED);
$dom->formatOutput = true;
print $dom->saveXML($dom->documentElement);
But the code above gives me this error:
DOMDocument::loadHTML(): Tag header invalid in Entity, line: 1
What is the proper way to do it?
This is the corrected code:
$html = file_get_contents("https://www.emitennews.com/search/");
$dom = new \DOMDocument();
libxml_use_internal_errors(true);
$dom->preserveWhiteSpace = false;
$dom->loadHTML('<?xml encoding="UTF-8">' . $html,LIBXML_HTML_NOIMPLIED);
$dom->formatOutput = true;
print $dom->saveXML($dom->documentElement);
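The problem is that the site uses HTML5. DOMDocument::loadHTML() relies on libxml's HTML parser, which here does not recognize HTML5 elements such as <header> and reports them as invalid tags. The message is a warning, not a fatal error, so we need to call the following before loadHTML() to have the parser store its errors internally instead of raising them as PHP warnings:
libxml_use_internal_errors(true);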
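If you also want to see what the parser complained about rather than silently discarding it, something like the following may help. It is only a minimal sketch: the $html string is made-up sample markup standing in for the downloaded page, and $messages and $previous are illustrative variable names. Whether any messages are collected depends on the libxml version installed.

```php
<?php
// Made-up sample markup standing in for the result of file_get_contents().
$html = '<html><body><header><h1>Title</h1></header><p>Text</p></body></html>';

// Store libxml errors internally; remember the previous setting to restore it later.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED);
$dom->formatOutput = true;

// Copy the collected parser messages (possibly none), then free the error buffer.
$messages = [];
foreach (libxml_get_errors() as $error) {
    $messages[] = trim($error->message) . ' (line ' . $error->line . ')';
}
libxml_clear_errors();
libxml_use_internal_errors($previous);

$pretty = $dom->saveXML($dom->documentElement);
print $pretty;
```

Clearing the buffer with libxml_clear_errors() keeps the stored errors from piling up if you parse many pages in one script, and $messages can be logged if you need to debug malformed markup.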