Tags: php, html, web-scraping, file-get-contents

PHP Crawl Specific Tab Content of External Website & Return href


Using PHP, I want to retrieve a specific element from an external website.

The external website is https://mcnmedia.tv/iframe/2684. The element I want to retrieve is the first link in the 'Recordings' tab.

For example, the first link contains the following HTML:

<div class="small-12 medium-6 me column recording-item">
    <div class="recording-item-inner">
        <a class="small-12 column recording-name" href="/recordings/2435">
        <div class="info">
            <b>Mass</b><br>
            <small>26 Mar 2020</small>
        </div><i class="fa fa-play"></i></a>
    </div>
</div>

I want to retrieve the href and display a direct link on my website, like this:

View Latest Recording - https://mcnmedia.tv/recordings/2435.

I have the following PHP, but it isn't working as I'd like. It currently outputs only the text (Mass 26 Mar 2020), and I'm not sure how to get the actual href link address.

<?php
$page = file_get_contents('https://mcnmedia.tv/iframe/2684');
$doc = new DOMDocument();
@$doc->loadHTML($page);
$xpath = new DOMXPath($doc);
$nodeList = $xpath->query("//div[@class='recording-item-inner']");
$node = $nodeList->item(0);
// To check the result:
echo "<p>" . $node->nodeValue . "</p>";
?>

How can I achieve this?


Solution

  • Your XPath stops at the <div>, so nodeValue returns its text content rather than the link. Append /a/@href to select the href attribute of the <a> tag inside it:

    $nodeList = $xpath->evaluate("//div[@class='recording-item-inner']/a/@href");
    

    You can simplify this further: evaluate() can also return a scalar value, so wrap the XPath in string() to get the attribute's value as a string instead of a node list:

    $href = $xpath->evaluate("string(//div[@class='recording-item-inner']/a/@href)");
    echo "<p>" . $href . "</p>";
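
Note that the href in the sample markup is relative (/recordings/2435), so it needs the site's base URL prepended to produce the direct link asked for in the question. A minimal sketch of the whole flow follows. It assumes the recordings markup is present in the HTML being parsed; firstRecordingUrl() is a hypothetical helper name, and the inline $sample string stands in for the result of file_get_contents() so the example runs without network access.

```php
<?php
// Hypothetical helper (not part of any library): returns the absolute URL of
// the first recording link in $html, or null when no such link is found.
function firstRecordingUrl(string $html, string $baseUrl): ?string
{
    // Collect libxml parse warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // string() yields the value of the first matching attribute, or '' if none matches.
    $href = $xpath->evaluate("string(//div[@class='recording-item-inner']/a/@href)");

    if ($href === '') {
        return null;
    }

    // Assumption: the href is site-relative, as in the markup from the question.
    return rtrim($baseUrl, '/') . '/' . ltrim($href, '/');
}

// Stand-in for: $page = file_get_contents('https://mcnmedia.tv/iframe/2684');
$sample = <<<'HTML'
<div class="small-12 medium-6 me column recording-item">
    <div class="recording-item-inner">
        <a class="small-12 column recording-name" href="/recordings/2435">
        <div class="info"><b>Mass</b><br><small>26 Mar 2020</small></div>
        <i class="fa fa-play"></i></a>
    </div>
</div>
HTML;

$url = firstRecordingUrl($sample, 'https://mcnmedia.tv');
if ($url !== null) {
    echo '<a href="' . htmlspecialchars($url) . '">View Latest Recording</a>';
}
```

Returning null when nothing matches lets the calling code decide what to show if the page layout changes, rather than printing an empty link.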