Search code examples
twitterstreamingtweepy

Tweepy Streamer Limiting Tweets to 140 Characters


I created a tweepy listener to collect tweets into a local MongoDB during the first presidential debate but have realized that the tweets I have been collecting are limited to 140 characters and many are being cut off at the 140 character limit. In my stream I had definied tweet_mode='extended' which I thought would have resolved this issue, however, I am still not able to retrieve the full length of tweets longer than 140 characters. Below is my code:

auth.set_access_token(twitter_credentials.ACCESS_TOKEN, twitter_credentials.ACCESS_TOKEN_SECRET)

api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

# Create a listener MyListener that streams and stores tweets to a local MongoDB

class MyListener(StreamListener):
    
    def __init__(self):
        super().__init__()
        self.list_of_tweets = deque([], maxlen=5)
        
    def on_data(self, data):
        try:
            tweet_text = json.loads(data)
            self.list_of_tweets.append(tweet_text)
            self.print_list_of_tweets()
            db['09292020'].insert_one(tweet_text)
        except:
            None
          
    def on_error(self, status):
        print(status)

    def print_list_of_tweets(self):
        display.clear_output(wait=True)
        for index, tweet_text in enumerate(self.list_of_tweets):
            m='{}. {}\n\n'.format(index, tweet_text)
            print(m)  

debate_stream = Stream(auth, MyListener(), tweet_mode='extended')
debate_stream = debate_stream.filter(track=['insert', 'debate', 'keywords', 'here'])

Any input into how I can obtain the full extended tweet via this listener would be greatly appreciated!


Solution

  • tweet_mode=extended has no effect on the legacy standard streaming API, as Tweets are delivered in both truncated (140) and extended (280) form by default.

    So you'll want your Stream Listener set up like this:

    debate_stream = Stream(auth, MyListener())

    What you should be seeing is that the JSON object for longer Tweets has a text field of 140 characters, but contains an additional dictionary called extended_tweet which in turn contains a full_text field with the full Tweet text.