I am using Simple HTML DOM Parser to parse a link with PHP. Below are the URL and the PHP code I am using.
URL:
https://homeshopping.pk/products/-Imported-Stretchable-Tights-For-Women--Pack-Of-3-.html
PHP Script:
include('simple_html_dom.php'); // the library that provides file_get_html()

$html = file_get_html('https://homeshopping.pk/products/-Imported-Stretchable-Tights-For-Women--Pack-Of-3-.html');

foreach ($html->find('div#ProductDescription_Tab') as $description)
{
    // Blank out the comments block, then print the product description
    $comments = $description->find('.hsn_comments', 0);
    $comments->outertext = '';
    print $description->outertext;
}
The problem is that after running the script I get the front end the way I want, but viewing the page source shows a lot of JavaScript and CSS junk code. Is that okay? Can't I get only the HTML tags, without any extra CSS or JavaScript code? Below are screenshots of my PHP script and of the page source after running the script.
If you are using the latest version of Simple HTML DOM, you can use the remove() function. Here is a sample based on your existing code:
$html = file_get_html('https://homeshopping.pk/products/-Imported-Stretchable-Tights-For-Women--Pack-Of-3-.html');

foreach ($html->find('div#ProductDescription_Tab') as $description)
{
    // Blank out the comments block as before
    $comments = $description->find('.hsn_comments', 0);
    $comments->outertext = '';

    // Remove the divs that hold the injected script content; guard against
    // pages where they are missing, since find(..., 0) returns null then
    $minisite = $description->find('div#flix-minisite', 0);
    if ($minisite) $minisite->remove();

    $inpage = $description->find('div#flix-inpage', 0);
    if ($inpage) $inpage->remove();

    // Remove all <script> tags
    foreach ($description->find('script') as $s) $s->remove();

    // Remove all <style> tags
    foreach ($description->find('style') as $s) $s->remove();

    echo $description->innertext;
}
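If your copy of Simple HTML DOM is older and its nodes have no remove() method, here is a minimal sketch of the same cleanup using the outertext-blanking trick your own code already uses for .hsn_comments (the simple_html_dom.php include path is an assumption about where the library lives):

include('simple_html_dom.php'); // assumed path to the library file

$html = file_get_html('https://homeshopping.pk/products/-Imported-Stretchable-Tights-For-Women--Pack-Of-3-.html');

foreach ($html->find('div#ProductDescription_Tab') as $description)
{
    // Blank out the comments block, all <script> tags and all <style> tags
    // by emptying their outer HTML instead of calling remove()
    foreach ($description->find('.hsn_comments, script, style') as $junk) {
        $junk->outertext = '';
    }

    // The two script-holder divs may be missing on some pages, so check first
    foreach (array('div#flix-minisite', 'div#flix-inpage') as $selector) {
        $node = $description->find($selector, 0);
        if ($node) {
            $node->outertext = '';
        }
    }

    echo $description->innertext;
}

// Free the parser's memory when you are done
$html->clear();
unset($html);

Either way, the output should be just the product description markup, with the script and style blocks stripped out.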