
Filter data in Twitter Streaming API


I'm currently experimenting with the Twitter Streaming API. Everything works like a charm, but the API sends me tons of data that I don't need. Is there a way to filter the data the API sends me?

I'm using the following stream: https://stream.twitter.com/1.1/statuses/filter.json


Solution

  • Take a look at the documentation for the API's filter stream:

    https://dev.twitter.com/docs/api/1.1/post/statuses/filter

    You can pass a set of keywords in the track parameter to filter the stream; according to the current limits, you can track up to 400 keywords.

    After retrieving the tweets, you have to filter them again yourself to remove noisy data.

    So if you can describe what you are looking for with a set of keywords, you will get what you want. There will always be noise in your data, though, because it is almost impossible to define something that precisely through simple keyword filtering.

    For example, suppose you want to track all tweets related to a brand named XYZ. Your keyword set might contain just the single keyword "XYZ", and the API will send you every tweet containing it. But suppose "XYZ" is also a word in some language: people who speak that language will tweet it, and you will receive those tweets too. Suppose there is also a city called XYZ, and people post check-in messages from there. At that point you need to filter out the tweets that are unrelated to your topic, either by language detection or by contextual information retrieval. The key, though, is to choose a keyword set that describes the topic you want to cover.

    Cheers.
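
To make the two steps above concrete, here is a rough Python sketch: it builds the comma-separated value for the track parameter and then post-filters the received tweets on the client side. It makes no network calls. The function names, the noise terms, and the choice of English as the wanted language are all made up for illustration, and it assumes each tweet arrives as one JSON object per line carrying "text" and "lang" fields. The "lang" check stands in for the language detection mentioned above.

```python
import json

MAX_TRACK_KEYWORDS = 400  # limit quoted in the answer above; may change


def build_track_param(keywords):
    """Join keywords into the comma-separated value for the track parameter."""
    cleaned = [k.strip() for k in keywords if k.strip()]
    if len(cleaned) > MAX_TRACK_KEYWORDS:
        raise ValueError("too many keywords: %d" % len(cleaned))
    return ",".join(cleaned)


def is_relevant(tweet, wanted_langs=("en",), noise_terms=("check-in",)):
    """Hypothetical post-filter: keep a tweet only if its language is wanted
    and its text contains none of the noise terms."""
    if tweet.get("lang") not in wanted_langs:
        return False  # also drops messages that carry no "lang" field
    text = tweet.get("text", "").lower()
    return not any(term in text for term in noise_terms)


def filter_stream_lines(lines, **kwargs):
    """Yield the relevant tweets from an iterable of JSON lines."""
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines in the stream
            continue
        tweet = json.loads(line)
        if is_relevant(tweet, **kwargs):
            yield tweet


# Sample lines standing in for what the stream would send for track=XYZ.
sample = [
    '{"text": "Loving my new XYZ phone", "lang": "en"}',
    '',
    '{"text": "xyz significa otra cosa", "lang": "es"}',
    '{"text": "Just arrived in XYZ, check-in done", "lang": "en"}',
]

for tweet in filter_stream_lines(sample):
    print(tweet["text"])
```

In a real client you would send the result of build_track_param as the track parameter of the POST request and feed the response lines into filter_stream_lines. The noise terms are the part you will keep tuning, since they depend entirely on your topic.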