
SimplePie RSS Parser - Encoding and Weird Characters even on UTF-8


I am using SimplePie to parse an RSS feed, and I am getting the following output:

Don't forget our "Spot It, Post It" .....

My code is:

<?php
header('Content-type:text/html; charset=utf-8');
require_once('rss/simplepie.inc');

// We'll process this feed with all of the default options.
$feed = new SimplePie();

// Set which feed to process.
$feed->set_feed_url('FeedURL');
$feed->enable_cache(true);  
$feed->set_cache_duration(3600);  
$feed->set_cache_location('cache');
$feed->init();  
$feed->handle_content_type();  
?>

I'm using HTML5 Doctype AND I also have: <meta charset="charset=utf-8">

I've looked it up, and everything talks about changing the charset to UTF-8, which I clearly have, so I'm not sure what else is causing this.

Any ideas?


Solution

  • Does this happen with every feed, or just one particular feed? It might be the feed itself. If the description is proving problematic, you can use $item->get_content() to look at the content of the feed directly. Sometimes it is necessary to post-process information from a feed or web API; there are PHP code examples for stripping and replacing characters, and the News Blocks 2.0 demo on the SimplePie site has some cleaning code I've been using a lot recently.

    Good luck.
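
As a rough illustration of the "stripping and replacing characters" idea, here is a minimal sketch of a cleaning helper. It is not the News Blocks 2.0 code; the function name clean_feed_text and the choice of characters to replace are assumptions made for this example, and it assumes PHP 7 or later and input that is already valid UTF-8.

```php
<?php
// Hypothetical helper (not part of SimplePie): swaps common typographic
// punctuation for plain ASCII so it displays the same under any charset.
function clean_feed_text(string $text): string
{
    // Keys are UTF-8 characters written as Unicode escapes (PHP 7+).
    $replacements = [
        "\u{2018}" => "'",   // left single quotation mark
        "\u{2019}" => "'",   // right single quotation mark / apostrophe
        "\u{201C}" => '"',   // left double quotation mark
        "\u{201D}" => '"',   // right double quotation mark
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '-',   // em dash
        "\u{2026}" => '...', // horizontal ellipsis
        "\u{00A0}" => ' ',   // non-breaking space
    ];

    // strtr() with an array replaces every key with its value in one pass.
    return strtr($text, $replacements);
}
```

With the $feed object from the question, this could be applied inside a foreach ($feed->get_items() as $item) loop as echo clean_feed_text($item->get_content());. Note that it only sidesteps the symptom: if the feed declares one encoding and delivers another, the mismatch itself still needs to be tracked down.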