PHP String Manipulation: Extract hrefs


I have a string of HTML, and I would like to check whether it contains any links and, if so, extract them into an array. This is easy in jQuery thanks to its selectors, but I cannot find the right methods to use in PHP.

For example, the string may look like this:

<h1>Doctors</h1>
<a title="C - G" href="link1.html">C - G</a>
<a title="G - K" href="link2.html">G - K</a>
<a title="K - M" href="link3.html">K - M</a>

How (in PHP) can I turn it into an array that looks something like:

[1]=>"link1.html"
[2]=>"link2.html"
[3]=>"link3.html"

Thanks, Ian


Solution

  • You can use PHP's DOMDocument class to parse XML or HTML. Something like the following should do the trick and get the href attribute of every link in a string of HTML.

    $html = '<h1>Doctors</h1>
    <a title="C - G" href="link1.html">C - G</a>
    <a title="G - K" href="link2.html">G - K</a>
    <a title="K - M" href="link3.html">K - M</a>';

    $hrefs = array();

    // Parse the HTML string into a DOM tree.
    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Collect the href attribute of every <a> element.
    $tags = $dom->getElementsByTagName('a');
    foreach ($tags as $tag) {
        $hrefs[] = $tag->getAttribute('href');
    }
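
If the HTML comes from an untrusted or untidy source, the same idea can be wrapped in a small function. The following is only a sketch under a few assumptions: the function name extractHrefs is made up for illustration, libxml warnings about malformed markup are simply suppressed rather than reported, and an XPath query is used so that anchors without an href attribute are skipped.

```php
<?php
// Hypothetical helper; the name extractHrefs is not part of PHP or of the answer above.
function extractHrefs(string $html): array
{
    // DOMDocument::loadHTML() does not accept an empty string, so return early.
    if (trim($html) === '') {
        return [];
    }

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select only <a> elements that actually carry an href attribute.
    $xpath = new DOMXPath($dom);
    $hrefs = [];
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $hrefs[] = $anchor->getAttribute('href');
    }

    return $hrefs;
}
```

Calling print_r(extractHrefs($html)) with the string from the question should list the three link targets. Note that the resulting array is zero-indexed, so the keys run from 0 to 2 rather than from 1 to 3 as in the question.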