
How to add a location filter to tweepy module


I have found the following piece of code, which works pretty well for viewing the standard 1% of the Twitter firehose in the Python shell:

import sys
import tweepy

consumer_key=""
consumer_secret=""
access_key = ""
access_secret = "" 


auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)


class CustomStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        print status.text

    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream

    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True # Don't kill the stream

sapi = tweepy.streaming.Stream(auth, CustomStreamListener())
sapi.filter(track=['manchester united'])

How do I add a filter to only parse tweets from a certain location? I've seen people adding GPS coordinates to other Twitter-related Python code, but I can't find anything specific to sapi within the Tweepy module.

Any ideas?

Thanks


Solution

  • The streaming API doesn't allow filtering by location AND keyword simultaneously.

    Bounding boxes do not act as filters for other filter parameters. For example track=twitter&locations=-122.75,36.8,-121.75,37.8 would match any tweets containing the term Twitter (even non-geo tweets) OR coming from the San Francisco area.

    Source: https://dev.twitter.com/docs/streaming-apis/parameters#locations

    What you can do is ask the streaming API for either keyword-matched or geolocated tweets, and then filter the resulting stream in your app by inspecting each tweet.

    If you modify the code as follows, you will capture tweets from the United Kingdom, and those tweets are then filtered to show only the ones that contain "manchester united":

    import sys
    import tweepy
    
    consumer_key=""
    consumer_secret=""
    access_key=""
    access_secret=""
    
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)
    
    
    class CustomStreamListener(tweepy.StreamListener):
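        def on_status(self, status):
            # Keyword check done client-side, because the API combines
            # track and locations with OR rather than AND
            if 'manchester united' in status.text.lower():
                print status.text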
    
        def on_error(self, status_code):
            print >> sys.stderr, 'Encountered error with status code:', status_code
            return True # Don't kill the stream
    
        def on_timeout(self):
            print >> sys.stderr, 'Timeout...'
            return True # Don't kill the stream
    
    sapi = tweepy.streaming.Stream(auth, CustomStreamListener())
    # Bounding box as [west_lon, south_lat, east_lon, north_lat], roughly covering the UK
    sapi.filter(locations=[-6.38,49.87,1.77,55.81])
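
The "location AND keyword" check can also be pulled out of the listener into plain functions, which makes it easy to test without a live stream. The following is only a minimal sketch: `UK_BOX`, `in_bounding_box` and `matches` are hypothetical names introduced here for illustration, not part of Tweepy, and the box simply reuses the coordinates from the `locations` argument.

```python
# Bounding box in the same order used for the locations parameter:
# [west_lon, south_lat, east_lon, north_lat]
UK_BOX = [-6.38, 49.87, 1.77, 55.81]


def in_bounding_box(lon, lat, box):
    """Return True if the point (lon, lat) lies inside box (edges included)."""
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


def matches(text, lon, lat, keyword, box):
    """Return True only if text contains keyword AND the point is inside box."""
    return keyword.lower() in text.lower() and in_bounding_box(lon, lat, box)


# Example: a tweet posted from Manchester (longitude first, then latitude)
is_match = matches("Manchester United win again", -2.24, 53.48,
                   "manchester united", UK_BOX)
```

Inside `on_status` this could be called with the tweet's text and its point coordinates, assuming the status object carries an exact point (longitude, latitude). Many geotagged tweets may only have a place and no exact point, so that case would need to be handled separately before calling `matches`.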