Tags: python, django, tweetstream

Python: handling a persistent HTTP connection in a web application


I'm in a bit over my head as a Python beginner, but I've managed to set up a connection to the Twitter streaming API from a Django application using tweetstream.

Within the application I can do the following and get a constant stream of tweets printed to the console by the Django development server:

with tweetstream.FilterStream(arg, arg, arg, arg, arg) as stream:
    for tweet in stream:
        print(tweet)

I can also do something like this to query the stream for information:

my_tweetstream = tweetstream.FilterStream(arg, arg, arg, arg, arg)
print(my_tweetstream.variable)

Ideally, I'd like to start tweetstream so that it logs tweets, and also be able to visit an admin page that, on refresh, queries the connection and reports how long it has been connected, how many tweets have been received, and so on.

The problem is that I have no idea how to do this with the code I have so far. For instance, how can I 'store' the connection so that I can query it later?

Could someone explain the right approach, and point me to resources that would help me understand the problem better?

Thanks in advance,


Solution

  • I did this recently for a project. You'll need to run the stream consumer as a separate Python process, because the streaming call blocks for as long as the connection stays open. It doesn't need to be part of your Django application at all.

    Basically I had:

    # Written against Tweepy 3.x; StreamListener was merged into Stream in Tweepy 4.0.
    from tweepy import OAuthHandler
    from tweepy import Stream
    from tweepy.streaming import StreamListener
    
    from myproject.myapp.utils import do_something_with_tweet
    
    class StdOutListener(StreamListener):

        def on_data(self, data):
            # Hand each raw tweet payload to the application code.
            do_something_with_tweet(data)
            return True  # returning False would stop the stream
    
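    def main():
        listener = StdOutListener()

        # The four TWITTER_* credentials are assumed to be defined elsewhere,
        # e.g. imported from your settings module.
        auth = OAuthHandler(
            TWITTER_CONSUMER_KEY,
            TWITTER_CONSUMER_SECRET)

        auth.set_access_token(
            TWITTER_ACCESS_TOKEN,
            TWITTER_ACCESS_SECRET)

        try:
            stream = Stream(auth, listener)
            # Blocks here, calling listener.on_data() for every matching tweet.
            stream.filter(track=['#something'])
        except (KeyboardInterrupt, SystemExit):
            print('Stopping Twitter Streaming Client')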
    
    
    if __name__ == '__main__':
        main()
    

    This way the consumer runs as a separate process and passes each tweet to a function that saves it (or does whatever else you need), while Django runs happily elsewhere.

    For bonus points, use Celery to process the tweet data in asynchronous tasks: https://celery.readthedocs.org
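
To tie this back to the admin page in the question: once the consumer is its own process, the web side can't hold the connection object, so one option is for the consumer to write each tweet to a shared store that the view reads on refresh. Below is a minimal sketch of that idea using only the standard library's sqlite3. The names `DB_PATH`, `get_connection`, `save_tweet` and `stream_stats` are made up for illustration (they are not part of tweepy, tweetstream or Django), and in a real Django project you would more likely use a model and the ORM. Note that it reports the time since the first and last stored tweet, which only approximates how long the stream has been connected.

```python
import sqlite3
import time

DB_PATH = "tweets.db"  # hypothetical file shared by the consumer and the web app


def get_connection(path=DB_PATH):
    """Open the shared database and create the table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "received_at REAL NOT NULL, "
        "payload TEXT NOT NULL)"
    )
    return conn


def save_tweet(data, conn, now=None):
    """Consumer side: store one raw tweet string with its arrival time."""
    received_at = time.time() if now is None else now
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO tweets (received_at, payload) VALUES (?, ?)",
            (received_at, data),
        )


def stream_stats(conn, now=None):
    """Web side: summarise what the consumer has stored so far."""
    now = time.time() if now is None else now
    count, first_seen, last_seen = conn.execute(
        "SELECT COUNT(*), MIN(received_at), MAX(received_at) FROM tweets"
    ).fetchone()
    return {
        "tweet_count": count,
        "seconds_since_first_tweet": None if first_seen is None else now - first_seen,
        "seconds_since_last_tweet": None if last_seen is None else now - last_seen,
    }


if __name__ == "__main__":
    # In-memory database just for the demo; two real processes need a file.
    demo = get_connection(":memory:")
    save_tweet('{"text": "hello"}', demo)
    print(stream_stats(demo))
```

In this sketch the listener's `on_data` would call something like `save_tweet(data, conn)`, and the admin view would call `stream_stats(conn)` on each refresh. Each process opens its own connection to the same file; neither needs a reference to the other.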