PHP - parsing src attribute of <img> tag in string


What is the simplest reliable way to get the src attribute of the first <img> tag found in an arbitrary text string, without using any external libraries? That is, I want everything between the opening and closing quotation marks of the <img> tag's src attribute.


I wrote this script, but it is not reliable in some cases:

  $string = $item['description'];
  $arr = explode('img', $string);
  $arr = explode('src', $arr[1]);
  $arr = explode('=', $arr[1]);
  $arr = explode('>', $arr[1]);

  $pos1 = strpos($arr[0], '"')+1;
  $pos2 = strrpos($arr[0], '"')-1;

  if (!$pos1) {
    $pos1 = strpos($arr[0], "'")+1;
    $pos2 = strrpos($arr[0], "'")-1;
  }

  if ($pos1 && $pos2) { 
    $result = substr($arr[0], $pos1, $pos2); 
  }
  else { $result = null; }

Solution

  • The most reliable way is to use the built-in DOMDocument class (available since PHP 5). Call getElementsByTagName('img'), check that the length of the returned list is greater than 0, and read the src value of the first item with getAttribute('src'):

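    $html = "YOUR_HTML_STRING";
    // Keep libxml from emitting warnings on malformed or HTML5 markup.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument('1.0', 'UTF-8');
    // The flags stop libxml from adding <html>/<body> wrappers and a doctype.
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    $imgs = $dom->getElementsByTagName('img');
    // Only read the attribute when at least one <img> was found.
    if ($imgs->length > 0) {
        echo $imgs->item(0)->getAttribute('src');
    }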
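If the lookup is needed in more than one place, the same DOMDocument steps can be wrapped in a small function. This is only a sketch: the name first_img_src is hypothetical and not part of the original answer, and it assumes PHP 7.1 or later for the nullable return type. It returns null when the input is empty, contains no <img>, or the first <img> has no src attribute.

```php
<?php
// Hypothetical helper; the name first_img_src is an assumption for illustration.
// Returns the src of the first <img> in $html, or null if none is available.
function first_img_src(string $html): ?string
{
    if (trim($html) === '') {
        return null; // loadHTML() rejects empty input
    }

    // Collect libxml parse errors internally instead of emitting warnings,
    // then restore whatever setting was active before.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $imgs = $dom->getElementsByTagName('img');
    if ($imgs->length === 0) {
        return null;
    }

    // An <img> without a src attribute is reported as null, not as ''.
    $img = $imgs->item(0);
    return $img->hasAttribute('src') ? $img->getAttribute('src') : null;
}

// Usage: double-quoted, single-quoted, and image-free input.
var_dump(first_img_src('<p>Hi <img alt="x" src="a.png"> and <img src="b.png"></p>'));
var_dump(first_img_src("<img src='single.png'>"));
var_dump(first_img_src('no image here'));
```

Because the parser handles quoting, attribute order, and extra whitespace, the function does not need the separate single-quote and double-quote branches that the explode()-based script relies on.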