Extract specific data from SimplePie get_content object


I have an RSS feed from which I'm trying to extract data through SimplePie (in WordPress).

I need to extract the content tag. <?php echo $item->get_content(); ?> works and outputs the following (this is just one entry; the others have the same structure):

<table><tr valign="top">
<td width="67">
<a href="http://www.anobii.com/books/Lapproccio_sistemico_al_governo_dellimpresa/9788813230944/014c5c45a7ddaab1ec/" style="border: 1px solid #333333">
<img src="http://image.anobii.com/anobi/image_book.php?type=3&amp;item_id=014c5c45a7ddaab1ec&amp;time=0">
</a>
</td><td style="margin-left: 10px;padding-left: 10px">[person name] put "[title]" onto shelf<br/></td></tr></table>

What I need, though, is just the value of the src="" attribute on the img tag (the image URL). How can I extract only that?


Solution

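  • You can do it using DOMDocument (the best way, since it actually parses the HTML):

    $html = $item->get_content();

    $doc = new DOMDocument();
    // Collect parse warnings for the HTML fragment instead of silencing them with @
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $imgs = $doc->getElementsByTagName('img');
    // item(0) is null when the entry contains no <img>
    $res = $imgs->length > 0 ? $imgs->item(0)->getAttribute('src') : null;

    print_r($res);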
    

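    With a regex (the bad way, since it breaks easily when the markup changes):

    // \K discards the matched prefix, so $match[0] is the raw URL (entities such as &amp; stay encoded)
    if (preg_match('~\bsrc\s*=\s*["\']\K[^"\']*+~i', $html, $match)) {
        print_r($match[0]);
    }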
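
If you need this for every feed item, the DOMDocument approach can be wrapped in a small helper. The following is only a sketch: first_image_src() is a hypothetical name (it is not part of SimplePie or WordPress), the type declarations assume PHP 7.1 or later, and the sample $html is a shortened copy of the entry from the question.

```php
<?php
// Hypothetical helper (not part of SimplePie): returns the src of the
// first <img> in an HTML fragment, or null if there is none.
function first_image_src(string $html): ?string
{
    if (trim($html) === '') {
        return null; // loadHTML() does not accept an empty string
    }

    $doc = new DOMDocument();

    // Collect libxml parse warnings instead of silencing them with @,
    // then restore whatever error mode was active before.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $img = $doc->getElementsByTagName('img')->item(0);
    if (!$img instanceof DOMElement) {
        return null; // no <img> in this fragment
    }

    // getAttribute() returns the decoded value (&amp; becomes &)
    // and an empty string when the attribute is missing.
    $src = $img->getAttribute('src');

    return $src !== '' ? $src : null;
}

// Sample fragment shaped like the feed entry in the question.
$html = '<table><tr valign="top"><td width="67"><a href="#">'
      . '<img src="http://image.anobii.com/anobi/image_book.php?type=3&amp;item_id=014c5c45a7ddaab1ec&amp;time=0">'
      . '</a></td></tr></table>';

echo first_image_src($html), PHP_EOL;
```

Inside the SimplePie loop from the question, the call would then be something like first_image_src($item->get_content()). Returning null for entries without an image lets the template skip them instead of failing on a null node.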