
PHP Twitter Tweets Language


I'm building a site that uses tweets from Twitter's public timeline:

http://twitter.com/statuses/public_timeline.xml

I don't want tweets in Chinese, Russian, etc. I want to keep everything except the tweets written in non-Latin scripts.

Here is an example of what I don't want: スポーツブランドPR、マーケティング。2児の母。好きなもの:ユニコーン、着物、駅伝。

I've tried mb_detect_encoding with UTF-8, but that isn't working.


Solution

  • The encoding is the same for every tweet: the English posts are in UTF-8 too ;) That is why mb_detect_encoding can't tell them apart.

    There are two options. Either find a way to have the Twitter API return English-only posts,

    or loop over the posts with a regex and filter out the ones that contain non-Roman/Latin characters. preg_match returns 1 when the post contains such a character:

    // Returns 1 if $post contains a character outside U+0000-U+00FF (Latin-1).
    preg_match('/[^\x00-\xFF]/u', $post);
    

    Hope this helps,

    Niko
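
For what it's worth, here is a minimal sketch of the regex-and-loop option. It is not part of the answer above: it swaps the byte-range test for PCRE's Unicode script properties, so typographic quotes or emoji don't get a tweet rejected. The helper name isLatinOnly and the sample $tweets array are made up for illustration; in practice the texts would come from the parsed timeline.

```php
<?php
// Hypothetical helper: true when $text contains no letters from a
// non-Latin script. Digits, punctuation and emoji are not letters,
// so they never cause a rejection.
function isLatinOnly(string $text): bool
{
    // Match any letter (\p{L}) that is not in the Latin script.
    // The /u modifier makes PCRE treat subject and pattern as UTF-8.
    $nonLatinLetter = '/(?!\p{Latin})\p{L}/u';

    // preg_match returns 0 on no match, 1 on a match, false on error
    // (for example invalid UTF-8); only a clean "no match" passes.
    return preg_match($nonLatinLetter, $text) === 0;
}

// Sample data standing in for the tweet texts (assumption).
$tweets = [
    'Just landed in Zürich, coffee first!',
    'スポーツブランドPR、マーケティング。',
    'Привет, мир',
];

// Keep only the tweets that pass the check, reindexed from 0.
$kept = array_values(array_filter($tweets, 'isLatinOnly'));

print_r($kept);
```

Accented Latin letters such as ü pass, while a tweet mixing Latin and Japanese is dropped because it contains at least one non-Latin letter. Note that this only detects the script, not the language: French or Spanish tweets would still get through.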