php xml rss cdata

Extracting data from an RSS feed containing <![CDATA[]]> with PHP


This is a description element I get from the RSS feed:

        <description><![CDATA[ <img src="http://images.24ur.com/media/images/210/Nov2012/61090877.jpg" alt="24ur.com"/>
        Na sedežu Evropske nogometne zveze v Nyonu so izžrebali pare osmine finala Lige prvakov. Bržkone bo najbolj vroče v Madridu, kjer se bo zasedba Reala uvodoma udarila z Manchester Unitedom, povratni dvoboj pa bosta velikana evropskega nogometa odigrala v Manchestru.]]></description>

It contains a CDATA section, whose content the XML parser does not parse as markup. If I run

echo $test->description;

I see the image in the browser, but I cannot access its src attribute in the script. Any idea how to do this?


Solution

  • You cannot access markup inside a CDATA section as XML, because the parser treats it as plain character data.
    You need to extract the src with a regular expression,
    or parse the CDATA content as a separate document.

    Tested & works:

    $h = '<img src="http://images.24ur.com/media/images/210/Nov2012/61090877.jpg" alt="24ur.com"/>';

    // Match from "http://" up to the closing quote of the attribute.
    preg_match('~http://[^"\']+~', $h, $matches);
    var_dump($matches[0]);
    

    Outputs:

    string(60) "http://images.24ur.com/media/images/210/Nov2012/61090877.jpg"
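
The other option mentioned above, parsing the CDATA content as a separate document, could look roughly like the sketch below. It assumes the feed is read with SimpleXML, as `$test->description` in the question suggests, and uses DOMDocument to parse the extracted text as HTML. The `<item>` wrapper, the placeholder article text, and the variable names are made up for illustration.

```php
<?php
// Hypothetical feed fragment standing in for the real RSS item.
$xml = <<<'XML'
<item>
  <description><![CDATA[ <img src="http://images.24ur.com/media/images/210/Nov2012/61090877.jpg" alt="24ur.com"/>
  Some article text.]]></description>
</item>
XML;

$item = simplexml_load_string($xml);

// Casting the element to a string yields the CDATA content as plain text.
$html = (string) $item->description;

// Parse that text as an HTML fragment. libxml errors are buffered
// because markup embedded in feeds is often not well-formed.
$doc = new DOMDocument();
$previous = libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Take the src attribute of the first <img>, if there is one.
$src = null;
foreach ($doc->getElementsByTagName('img') as $img) {
    $src = $img->getAttribute('src');
    break;
}

echo $src, PHP_EOL;
```

With the sample fragment this should print the image URL. Compared with the regular expression, this keeps working if the attribute order or quoting style changes, at the cost of a little more code.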