
Parse anchor tags that have an img tag as a child element


I need to find all anchor tags that contain an img tag, either as a direct child or nested deeper. Consider the following cases:

<a href="test1.php">
 <img src="test1.jpg" alt="Test 1" />
</a>

<a href="test2.php">
 <span>
  <img src="test2.jpg" alt="Test 2" />
 </span>
</a>

I need to generate a list of the href attributes along with each image's src and alt, i.e.:

$output = array(
 array(
  'href' => 'test1.php',
  'src'  => 'test1.jpg',
  'alt'  => 'Test 1'
 ),
 array(
  'href' => 'test2.php',
  'src'  => 'test2.jpg',
  'alt'  => 'Test 2'
 )
);

How can I match the above cases in PHP (using DOMXPath or any other DOM parser)?

Thanks in Advance!


Solution

  • Assuming $doc is a DOMDocument representing your HTML document:

    $output = array();
    $xpath = new DOMXPath($doc);
    # find each img inside a link
    foreach ($xpath->query('//a[@href]//img') as $img) {
    
        # find the link by walking up the parents until an <a> is reached
        # the query only returns <img>s that are inside an <a>, so this always terminates
        for ($link = $img; $link->tagName !== 'a'; $link = $link->parentNode);
    
        $output[] = array(
            'href' => $link->getAttribute('href'),
            'src'  => $img->getAttribute('src'),
            'alt'  => $img->getAttribute('alt'),
        );
    }
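
For reference, here is a self-contained sketch of the same approach that can be run as-is. It assumes the markup from the question is available as a string (the `$html` variable is only for illustration) and loads it with `DOMDocument::loadHTML`. Instead of the manual parent walk, it looks up the enclosing link with a second XPath query, `ancestor::a[@href][1]`, which should select the nearest `<a href>` ancestor of each image. Treat it as one possible variation rather than the only way to do this.

```php
<?php
// Sample markup copied from the question; $html is an illustrative name.
$html = <<<'HTML'
<a href="test1.php">
 <img src="test1.jpg" alt="Test 1" />
</a>

<a href="test2.php">
 <span>
  <img src="test2.jpg" alt="Test 2" />
 </span>
</a>
HTML;

// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath  = new DOMXPath($doc);
$output = array();

// Find every <img> that sits anywhere inside an <a href="...">.
foreach ($xpath->query('//a[@href]//img') as $img) {
    // ancestor:: is a reverse axis, so [1] picks the closest matching <a>.
    $link = $xpath->query('ancestor::a[@href][1]', $img)->item(0);

    $output[] = array(
        'href' => $link->getAttribute('href'),
        'src'  => $img->getAttribute('src'),
        'alt'  => $img->getAttribute('alt'),
    );
}

print_r($output);
```

Note that a link containing several images produces one entry per image, each with the same href.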