What would be the simplest but reliable way to parse the src attribute of the first <img> tag found in an arbitrary text string, without using any external libraries? That is, to get everything between the opening and closing " characters of the <img> tag's src attribute.
I wrote this script, but it is not reliable in some cases:
$string = $item['description'];
$arr = explode('img', $string);
$arr = explode('src', $arr[1]);
$arr = explode('=', $arr[1]);
$arr = explode('>', $arr[1]);
$pos1 = strpos($arr[0], '"')+1;
$pos2 = strrpos($arr[0], '"')-1;
if (!$pos1) {
    $pos1 = strpos($arr[0], "'")+1;
    $pos2 = strrpos($arr[0], "'")-1;
}
if ($pos1 && $pos2) {
    $result = substr($arr[0], $pos1, $pos2);
}
else { $result = null; }
The safest way is to use the built-in DOMDocument class (available since PHP 5). Call getElementsByTagName('img'), check that the length is greater than 0, and read the first item's src value with getAttribute('src'):
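$html = "YOUR_HTML_STRING";
// Collect libxml parse warnings instead of emitting them for malformed HTML
libxml_use_internal_errors(true);
$dom = new DOMDocument('1.0', 'UTF-8');
// The flags (PHP 5.4+) stop libxml from adding <html>/<body> wrappers and a doctype
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$imgs = $dom->getElementsByTagName('img');
if ($imgs->length > 0) {
    // Print the src of the first <img> in document order
    echo $imgs->item(0)->getAttribute('src');
}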
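If the lookup is needed in more than one place, the same DOMDocument approach can be wrapped in a small function. The following is only a sketch: the name firstImgSrc is hypothetical, the type declarations assume PHP 7.1 or later, and returning null for "no <img>" or "no src attribute" is an assumption chosen to match the $result = null fallback in the question.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the src of
// the first <img> in $html, or null when there is no <img> or no src attribute.
function firstImgSrc(string $html): ?string
{
    if (trim($html) === '') {
        return null; // nothing to parse; loadHTML() rejects an empty string
    }

    // Collect parse warnings from malformed markup instead of emitting them,
    // and restore the previous libxml setting afterwards.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument('1.0', 'UTF-8');
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $imgs = $dom->getElementsByTagName('img');
    if ($imgs->length === 0) {
        return null;
    }

    $img = $imgs->item(0);
    return $img->hasAttribute('src') ? $img->getAttribute('src') : null;
}
```

With the data from the question it could be called as $result = firstImgSrc($item['description']);. Because the parser handles quoting itself, single-quoted and double-quoted src values are both covered without any string slicing.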