php image domdocument fopen src

Get all image sources from a web page in PHP


What I want to do is let the user type in the URL of a page that contains images, such as https://www.flickr.com/search/?text=arushad%20ahmed, then get every image source from the 'src' attributes and display them.

The following approach didn't work:

$file = fopen("https://www.flickr.com/search/?text=arushad%20ahmed", "r");
$doc = new DOMDocument();
$doc->loadHTML($file);
$image = $doc->getElementsByTagName('img');

foreach ($image as $img) {
    echo $img;
}

So how can I make this work as I want?


Solution

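  • src isn't a tag, it's an attribute of the img tag, so you read it with getAttribute(). Two other things go wrong in your code: fopen() returns a file handle rather than the page's HTML, and a DOMElement can't be echoed directly.
    You said you're new to PHP, so that's pretty normal, no worries. Use this code: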

    $doc = new DOMDocument();
    // Download and parse the page in one step
    $doc->loadHTMLFile("https://www.flickr.com/search/?text=arushad%20ahmed");
    $xpath = new DOMXPath($doc);
    // Select every <img> element in the document
    $imgs = $xpath->query("//img");
    foreach ($imgs as $img) {
        // src is an attribute of the <img> element
        $src = $img->getAttribute("src");
        // do something with $src, e.g. echo it
    }
    

    Learn more about PHP DOMDocument
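
To make the "do something with $src" step concrete, here is a minimal sketch that collects the src values into an array and prints them. The helper name extractImageSources and the sample markup are made up for illustration; they are not part of the original answer. It parses an inline HTML string so you can try it without a network connection.

```php
<?php
// Hypothetical helper (not from the original answer): collect the src
// attribute of every <img> element found in an HTML string.
function extractImageSources(string $html): array
{
    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $sources = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        // getAttribute() returns an empty string when the attribute is missing
        $src = $img->getAttribute('src');
        if ($src !== '') {
            $sources[] = $src;
        }
    }
    return $sources;
}

// Stand-in markup; in practice this would be the downloaded page.
$html = '<html><body>'
      . '<img src="https://example.com/a.jpg" alt="a">'
      . '<img src="/images/b.png" alt="b">'
      . '<img alt="no source">'
      . '</body></html>';

$sources = extractImageSources($html);

foreach ($sources as $src) {
    // Escape before echoing, because the value comes from a remote page.
    echo '<img src="' . htmlspecialchars($src, ENT_QUOTES) . '">' . PHP_EOL;
}
```

Note that relative values such as /images/b.png come back exactly as written in the page, so you would still need to resolve them against the page URL before displaying them from your own site.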


    Update

    Based on your comment, you don't seem to have PHP DOMDocument support. You can use the commands below to install it (they assume a yum-based system with the webtatic repository available):

    sudo yum --enablerepo=webtatic install php-xml
    sudo /sbin/service httpd stop
    sudo /sbin/service httpd start
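
If you want to confirm that the install worked after restarting the web server, a quick check like the one below may help. It is only a sketch: the variable name $hasDom is arbitrary, and it relies on DOMDocument being provided by PHP's "dom" extension.

```php
<?php
// DOMDocument is provided by the "dom" extension, so check both the
// extension and the class before relying on them.
$hasDom = extension_loaded('dom') && class_exists('DOMDocument');

echo 'DOMDocument support: ' . ($hasDom ? 'available' : 'missing') . PHP_EOL;
```

Run it through the same web server (not only the command line), since the two can load different sets of extensions.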
    

    Also, the page you're trying to parse doesn't contain valid HTML. You can use HTML Tidy to fix it, e.g.:

    $html = file_get_contents('https://www.flickr.com/search/?text=arushad%20ahmed');
    if ($html === false) {
        die('Could not download the page.');
    }
    // Ask Tidy to clean up the markup and emit HTML
    $config = array(
      'clean' => 'yes',
      'output-html' => 'yes',
    );
    $tidy = tidy_parse_string($html, $config, 'utf8');
    $tidy->cleanRepair();
    $doc = new DOMDocument();
    // loadHTML() expects a string, so pass the repaired markup explicitly
    $doc->loadHTML(tidy_get_output($tidy));
    // the rest of the code is the same
    $xpath = new DOMXPath($doc);
    $imgs = $xpath->query("//img");
    foreach ($imgs as $img) {
        $src = $img->getAttribute("src");
        // do something with $src
    }
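
If the Tidy extension isn't installed, one alternative worth trying is to let libxml recover from the broken markup on its own and simply keep its warnings out of the output. This is a swapped-in technique, not what the answer above does, and the sample markup is invented for the example.

```php
<?php
// Assumption: no Tidy extension, so rely on libxml's own error recovery.
// Deliberately sloppy sample markup with an unclosed <p>.
$html = '<div><img src="a.jpg"><p>unclosed paragraph<img src="b.png"></div>';

// Buffer parser warnings internally instead of printing them.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// Keep the warnings for inspection, then reset libxml's state.
$warnings = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Select only <img> elements that actually carry a src attribute.
$xpath = new DOMXPath($doc);
$sources = [];
foreach ($xpath->query('//img[@src]') as $img) {
    $sources[] = $img->getAttribute('src');
}

print_r($sources);
```

The $warnings array holds whatever the parser complained about, so you can log it while debugging instead of discarding it.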