python twitter location tweepy

How to filter tweets by the bounding box of an input location (city, country, area) with the tweepy API in Python


I am trying to filter the tweet stream by both a keyword and a location, each supplied as a variable.

If I want to filter tweets by keyword and by a certain location, the locations parameter in the line below

myStream.filter(track=keywords_to_track, locations=boundary_box)

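should be a bounding box with four coordinates for the input location, in the order (min longitude, min latitude, max longitude, max latitude), i.e. the southwest corner followed by the northeast corner.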

How can I get boundary_box for a given location (held in a variable)? Or is there another way to solve this?

I have also tried https://www.mapdevelopers.com/geocode_bounding_box.php, but it's not working.

I am new to tweepy API.

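import tweepy
from json import dumps
from kafka import KafkaProducer
from tweepy import OAuthHandler

# arguments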
topic_name = 'kafkatwitter_1'

# input variables
keywords_to_track = ['modi']
location_filter = 'New Delhi'

# twitter authorization
auth = OAuthHandler(API_KEY, API_KEY_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)

# init tweepy
api = tweepy.API(auth)
producer = KafkaProducer(bootstrap_servers=['localhost:9092'],
                         value_serializer=lambda x: dumps(x).encode('utf-8'),
                         api_version=(0, 10, 1))
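
# tweepy 3.x style: the listener subclasses StreamListener,
# matching the Stream(auth=..., listener=...) call below
class MyStreamListener(tweepy.StreamListener):
    def on_status(self, tweet):
        length = len(tweet.text.split(' '))
        # skip non-English tweets and tweets of 10 words or fewer
        if (tweet.lang != 'en') or (length <= 10):
            print("==filtered==")
        else:
            message = {
                "text": tweet.text,
                # process_time is a helper defined elsewhere (not shown)
                "created_at": process_time(tweet.created_at),
            }
            # send only when a message was actually built
            producer.send(topic_name, value=message)
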
# Step 2: Creating a Stream
myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth=api.auth, listener=myStreamListener)

# Step 3: Starting a Stream
myStream.filter(track=keywords_to_track, locations=boundary_box)

Solution

  • You could use the Nominatim API (OpenStreetMap's geocoding service) to look up the bounding box of a location.

    Search parameters include:

    • street=<housenumber> <streetname>
    • city=<city>
    • county=<county>
    • state=<state>
    • country=<country>
    • postalcode=<postalcode>

    Example:

    GET https://nominatim.openstreetmap.org/search?city=Tokyo&format=json&limit=1
    

    The response includes a boundingbox field, ordered as [min latitude, max latitude, min longitude, max longitude]:

    [
        {
            "place_id": 282632558,
            "licence": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
            "osm_type": "relation",
            "osm_id": 1543125,
            "boundingbox": [
                "20.2145811",
                "35.8984245",
                "135.8536855",
                "154.205541"
            ],
            "lat": "35.6828387",
            "lon": "139.7594549",
            "display_name": "Tokyo, Japan",
            "class": "boundary",
            "type": "administrative",
            "importance": 0.7593311914925306,
            "icon": "https://nominatim.openstreetmap.org/ui/mapicons//poi_boundary_administrative.p.20.png"
        }
    ]
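
The boundingbox values are strings in the order [min latitude, max latitude, min longitude, max longitude], while the `locations` parameter of the stream filter expects [west longitude, south latitude, east longitude, north latitude]. A minimal sketch of the conversion follows. `to_twitter_bbox` and `fetch_bbox` are hypothetical helper names, and the User-Agent string is a placeholder to replace with your own application's identifier (the Nominatim usage policy asks for one).

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def to_twitter_bbox(boundingbox):
    """Reorder a Nominatim boundingbox for the `locations` parameter.

    Nominatim: [min_lat, max_lat, min_lon, max_lon] (as strings)
    Twitter:   [west_lon, south_lat, east_lon, north_lat] (as floats)
    """
    min_lat, max_lat, min_lon, max_lon = (float(v) for v in boundingbox)
    return [min_lon, min_lat, max_lon, max_lat]


def fetch_bbox(city):
    """Look up a city on the public Nominatim server (needs network access)."""
    query = urlencode({"city": city, "format": "json", "limit": 1})
    request = Request(
        "https://nominatim.openstreetmap.org/search?" + query,
        # Placeholder identifier; replace with your own application name.
        headers={"User-Agent": "my-tweet-filter-example"},
    )
    with urlopen(request, timeout=10) as response:
        results = json.load(response)
    if not results:
        raise ValueError(f"no Nominatim result for {city!r}")
    return to_twitter_bbox(results[0]["boundingbox"])


# Offline check with the bounding box from the response above.
tokyo = ["20.2145811", "35.8984245", "135.8536855", "154.205541"]
print(to_twitter_bbox(tokyo))
```

With the stream from the question, this would be used as `boundary_box = fetch_bbox(location_filter)` before calling `myStream.filter(track=keywords_to_track, locations=boundary_box)`. As far as I know, the streaming endpoint combines `track` and `locations` with OR rather than AND, so requiring both usually means checking the keyword yourself inside `on_status`.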