Tags: python, python-3.x, twitter, tweepy, twitter-streaming-api

Limit tweepy stream to a specific number


import time

from tweepy import OAuthHandler, Stream
from tweepy.streaming import StreamListener


class listener(StreamListener):

    def on_status(self, status):
        try:
            userid = status.user.id_str
            geo = str(status.coordinates)
            if geo != "None":
                print(userid + ',' + geo)
            else:
                print("No coordinates")
            return True
        except BaseException as e:
            print('failed on_status,', str(e))
            time.sleep(5)

    def on_error(self, status):
        print(status)


# ckey, csecret, atoken and asecret hold the Twitter API credentials (defined elsewhere)
auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)

twitterStream = Stream(auth, listener())
twitterStream.filter(locations=[-97.54,32.55,-97.03,33.04])

I have this script for my tweepy stream, and it works perfectly. However, it keeps running until I terminate it with Ctrl+C. I tried adding a counter to on_status(), but it does not increment:

class listener(StreamListener):

    def on_status(self, status):
        i = 0
        while i < 10:
            userid = status.user.id_str
            geo = str(status.coordinates)
            if geo != "None":
                print(userid + ',' + geo)
                i += 1

No matter where I put the increment, the output repeats itself. If I add "i = 0" before the class, I get an error:

RuntimeError: No active exception to reraise

Any idea how I can make the counter work with streaming? As far as I know, the Cursor that comes with tweepy does not work with streaming.


Solution

  • Your while loop does not work because Tweepy calls on_status() once for every status it receives. A counter declared inside the method is reset to 0 on each call, and the while loop only re-processes the same status, so it cannot control a stream that is already running. A better approach is to store the counter as an instance attribute, which is initialized when the listener object is created, and to increment it inside on_status().

    class listener(StreamListener):

        def __init__(self):
            super().__init__()
            self.counter = 0   # statuses handled so far
            self.limit = 10    # stop after this many statuses

        def on_status(self, status):
            try:
                userid = status.user.id_str
                geo = str(status.coordinates)
                if geo != "None":
                    print(userid + ',' + geo)
                else:
                    print("No coordinates")
                self.counter += 1
                if self.counter < self.limit:
                    return True
                else:
                    # Limit reached: close the module-level stream object
                    twitterStream.disconnect()
            except BaseException as e:
                print('failed on_status,', str(e))
                time.sleep(5)
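
To see the difference between the two counters without a Twitter connection, here is a minimal sketch that simulates the callback pattern. FakeStream, LocalCounterListener, CountingListener and collect are hypothetical names that are not part of tweepy, and the rule that a False return value stops the loop is an assumption of this sketch rather than a statement about tweepy's internals.

```python
class FakeStream:
    """Hypothetical stand-in for a streaming client: one callback per item."""

    def __init__(self, listener):
        self.listener = listener
        self.running = False

    def run(self, items):
        self.running = True
        for item in items:
            if not self.running:
                break
            # Assumption of this sketch: a False return value stops the stream.
            if self.listener.on_status(item) is False:
                self.running = False


class LocalCounterListener:
    """Counter is a local variable, so it is recreated on every callback."""

    def __init__(self):
        self.seen = []

    def on_status(self, status):
        i = 0  # reset to 0 each time the stream calls this method
        i += 1
        self.seen.append(status)
        return i < 3  # always True (1 < 3), so the stream never stops


class CountingListener:
    """Counter is an instance attribute, so it survives between callbacks."""

    def __init__(self, limit):
        self.limit = limit
        self.counter = 0
        self.seen = []

    def on_status(self, status):
        self.seen.append(status)
        self.counter += 1
        return self.counter < self.limit  # False once the limit is reached


def collect(listener, items):
    """Run the fake stream over items and return what the listener handled."""
    FakeStream(listener).run(items)
    return listener.seen


statuses = ["status-%d" % n for n in range(10)]
print(len(collect(LocalCounterListener(), statuses)))      # 10: never stops early
print(len(collect(CountingListener(limit=3), statuses)))   # 3: stops at the limit
```

The local counter version handles all ten items because its count never gets past 1, while the instance attribute version stops after three.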