I am trying to parse HTML with simple_html_dom.php. The HTML I am trying to parse is shown below. I can successfully grab each product name (Product 1, Product 2, Product 3, etc.). I would also like to grab the itemprice_0 value from each product, and this is where I am running into issues. Here is my code:
<?php
require_once 'simple_html_dom.php';
$html = file_get_html('https://www.webaddress.com');
foreach($html->find('span.productName') as $e)
echo $e.'<br />'; //successfully displays all product names
foreach($html->find('#itemprice_0') as $e)
echo $e; //doesn't display the item prices
foreach($html->find('.dollar') as $e)
echo $e; //doesn't display the dollar amounts
?>
Here is the HTML:
<span class="productName">Product 1</span>
<p class="price">
<strike>
<span class="dollar-symbol">$</span>
<span class="dollar">15</span><span class="dot">.</span>
<span class="cents">99</span></strike>
</p>
<p class="salePrice" id='itemprice_0'>
<span class="dollar-symbol">$</span>
<span class="dollar">13</span><span class="dot">.</span>
<span class="cents">99</span>
</p>
I accessed the salePrice class and echoed out the dollar amount.
foreach($html->find('span.productName') as $e) {
    echo $e.'<br />'; //successfully displays all product names
}
foreach($html->find('p.price') as $e) {
    $e = str_replace(' ', '', $e); //$e is cast to its HTML string, so this strips every space in it
    echo 'Regular Price: ' . $e;
}
foreach($html->find('p.salePrice') as $e) {
    $e = str_replace(' ', '', $e);
    echo 'Sale Price: ' . $e;
}
I also removed the spaces from the output with str_replace.
Result:
Product 1
Regular Price: $15.99
Sale Price: $13.99
I also made the loop look for the itemprice_0 id only, and got the same result:
foreach($html->find('p[id=itemprice_0]') as $e) {
    $e = str_replace(' ', '', $e);
    echo 'Sale Price: ' . $e;
}
Same result:
Product 1
Regular Price: $15.99
Sale Price: $13.99
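One caveat with calling str_replace on the element itself: the element is converted to its full HTML string, so the spaces inside the tags (for example between p and class) are removed as well. If you only need the text, a minimal sketch using the library's plaintext property may be safer. It assumes simple_html_dom.php sits next to the script, uses str_get_html() on the markup from the question in place of the live page, and price_text() is a hypothetical helper name, not part of the library.

```php
<?php
require_once 'simple_html_dom.php';

// Sample markup copied from the question; a real page would be loaded
// with file_get_html() instead.
$markup = <<<'HTML'
<span class="productName">Product 1</span>
<p class="price">
<strike>
<span class="dollar-symbol">$</span>
<span class="dollar">15</span><span class="dot">.</span>
<span class="cents">99</span></strike>
</p>
<p class="salePrice" id='itemprice_0'>
<span class="dollar-symbol">$</span>
<span class="dollar">13</span><span class="dot">.</span>
<span class="cents">99</span>
</p>
HTML;

$html = str_get_html($markup);

// Hypothetical helper: return an element's text content with all
// whitespace (spaces, tabs, newlines) removed, leaving the tags untouched.
function price_text($element) {
    return preg_replace('/\s+/', '', $element->plaintext);
}

foreach ($html->find('span.productName') as $e) {
    echo $e->plaintext . '<br />';
}
foreach ($html->find('p.price') as $e) {
    echo 'Regular Price: ' . price_text($e) . '<br />';
}
foreach ($html->find('p.salePrice') as $e) {
    echo 'Sale Price: ' . price_text($e) . '<br />';
}
```

Because the whitespace is stripped from the text only, the markup itself is never altered, and the same helper works for both the regular and the sale price.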
Is this what you were looking for?