Get href value with DOMDocument in PHP

After calling file_get_contents, I get this HTML:

    <a href="blablabla.html">Manhattan Skyline</a>

I want to extract only the blablabla.html part.

How can I parse it with DOMDocument in PHP?

Important: the HTML I receive contains more than one <a href="...">.

Here is what I tried:

    $page = file_get_contents('https://...');
    $dom = new DOMDocument();
    $xp = new DOMXpath($dom);

    $url = $xp->query('h1//a[@href=""]');
    $url = $url->item(0)->getAttribute('href');

Thanks for your help.


  • h1//a[@href=""] matches only a elements whose href attribute is the empty string. Your href attribute holds a non-empty value, so the query returns nothing.

    If that snippet is the entire document, the expression //a is enough.

    Otherwise, h1//a selects every a element below an h1, provided the h1 is a direct child of the context node.

    To require that the a element has an href attribute, whatever its value, use h1//a[@href].

    If the h1 is not at the root of the document, start the path with //h1 instead, so the last example becomes //h1//a[@href].
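
The expressions above can be tried with a short sketch like the one below. The sample markup in $page is an assumption standing in for the fetched page, and the variable names $hrefs and $firstHref are only illustrative. Note that the snippet in the question never loads $page into $dom, so any query would run against an empty document; the sketch adds a loadHTML call for that reason.

```php
<?php
// Hypothetical sample markup standing in for the result of file_get_contents().
$page = '<html><body>'
      . '<h1><a href="blablabla.html">Manhattan Skyline</a></h1>'
      . '<p><a href="other.html">Other</a> <a name="anchor">No href</a></p>'
      . '</body></html>';

$dom = new DOMDocument();

// Collect libxml parse warnings internally instead of printing them,
// since real-world HTML is often not well-formed.
libxml_use_internal_errors(true);
$dom->loadHTML($page);   // the step missing from the question's code
libxml_clear_errors();

$xp = new DOMXPath($dom);

// Every a element that has an href attribute, anywhere in the document.
$hrefs = [];
foreach ($xp->query('//a[@href]') as $a) {
    $hrefs[] = $a->getAttribute('href');
}

// Only the first link found below an h1; item(0) is null when nothing matches.
$first = $xp->query('//h1//a[@href]')->item(0);
$firstHref = $first !== null ? $first->getAttribute('href') : null;

print_r($hrefs);
var_dump($firstHref);
```

Checking the result of item(0) for null before calling getAttribute avoids a fatal error when the expression matches nothing, which is what happens with the original h1//a[@href=""] query.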