I'm retrieving Tweets from a Twitter account, usually more than 200 at a time (for example 500 or 600).
I'm using the Tweepy library to do this in Python, and I have created this class for it:
from rrss.twitter_connection import TwitterConnection
import tweepy


class Tweets:
    def __init__(self):
        self.all_tweets = []  # List of tweets
        self.__total_tweets = None
        self.__screen_name = None
        self.__replies = None

    def __del__(self):
        del self.all_tweets
        del self.screen_name
        del self.total_tweets
        del self.replies

    @property
    def screen_name(self):  # Screen name of the Twitter account whose tweets we are going to retrieve
        return self.__screen_name

    @screen_name.setter
    def screen_name(self, screen_name):
        self.__screen_name = screen_name

    @screen_name.deleter
    def screen_name(self):
        del self.__screen_name

    @property
    def total_tweets(self):  # Total number of tweets to be returned
        return self.__total_tweets

    @total_tweets.setter
    def total_tweets(self, total):
        self.__total_tweets = total

    @total_tweets.deleter
    def total_tweets(self):
        del self.__total_tweets

    @property
    def replies(self):
        return self.__replies

    @replies.setter
    def replies(self, replies):
        self.__replies = replies

    @replies.deleter
    def replies(self):
        del self.__replies

    @staticmethod
    def __get_tweets(total, screen_name, oldest_id=None):
        """
        :param total: Number of tweets to return
        :param screen_name: Twitter account
        :param oldest_id: The id of the oldest tweet retrieved so far
        :return: A list with at most `total` tweets from the Twitter account associated with screen_name
        """
        api = TwitterConnection().api
        if oldest_id is None:
            tweets = api.user_timeline(screen_name=screen_name, count=total, include_rts=False, tweet_mode="extended")
        else:
            tweets = api.user_timeline(screen_name=screen_name, count=total, include_rts=False, max_id=oldest_id - 1, tweet_mode="extended")
        return tweets

    def get_tweets(self, total, screen_name):
        """
        Public method to get a total number of tweets from a screen name
        :param total: Total number of tweets to retrieve from a screen name
        :param screen_name: Twitter account
        :return: Updates self.all_tweets with all the tweets retrieved
        """
        self.screen_name = screen_name
        if total <= 200:
            self.all_tweets = Tweets.__get_tweets(total, screen_name)
        else:
            counter = 200
            self.all_tweets = Tweets.__get_tweets(counter, screen_name)
            if not self.all_tweets:
                return  # Nothing to paginate over
            oldest_id = self.all_tweets[-1].id
            while len(self.all_tweets) < total:
                # Request blocks of up to 200 tweets, each older than the last one retrieved
                total_block_tweets = 200 if total - counter > 200 else total - counter
                tweets = Tweets.__get_tweets(total_block_tweets, screen_name, oldest_id)
                if len(tweets) > 0:
                    self.all_tweets.extend(tweets)
                    oldest_id = self.all_tweets[-1].id
                    counter = len(self.all_tweets)
                else:
                    break

    def get_replies(self, tweet_id):
        api = TwitterConnection().api
        self.replies = tweepy.Cursor(api.search, q='to:{}'.format(self.screen_name), since_id=tweet_id, tweet_mode='extended').items()

    def search_replies_to_tweet(self, tweet_id):
        while True:
            try:
                reply = next(self.replies)
                print(reply.in_reply_to_status_id)
                if reply.in_reply_to_status_id == tweet_id:
                    print("reply of tweet:{}".format(reply.full_text))
            except StopIteration:
                print("The cursor has reached its end!!!")
                break
With this code, you can get all the Tweets from the Twitter account "MovistarEstu":
def main():
    t = Tweets()
    t.get_tweets(200, "MovistarEstu")
    i = 0
    for info in t.all_tweets:
        print(f"i: {i} - ID: {info.id} - created_at: {info.created_at}")
        print(f"text: {info.full_text}\n")
        i += 1
This gets the Tweets and prints some info about each of them. All of this works fine. My problem comes when I try to get all the replies to the Tweets created by "MovistarEstu" since a given ID: I get some of the replies, but not all of them.
For example, I get the replies to the Tweet with ID 1403443418085265411, but not the replies to the Tweet with ID 1391368878861824002, and I don't know why :(
With this code, I try to get all the replies to "MovistarEstu" since ID 1391364490286047238:
t.get_replies(1391364490286047238)
And now I try to find, among them, the replies to the "MovistarEstu" Tweet with ID 1391368878861824002:
t.search_replies_to_tweet(1391368878861824002)
But I don't get anything. However, if you go to Twitter, you can check that there are replies: https://twitter.com/MovistarEstu/status/1391368878861824002
If I search for the replies to this other ID instead, 1403443418085265411:
t.search_replies_to_tweet(1403443418085265411)
Then I do find the replies!!!
reply of tweet:@MovistarEstu Victoria en el 4 partido de la final
reply of tweet:@MovistarEstu Momento que no volveremos a ver en la puta vida
reply of tweet:@MovistarEstu Es buenísimo porque el CM del @MovistarEstu está boicoteando constantemente a su directiva haciéndonos recordar que el pasado fue glorioso y que nos han llevado a la absoluta mediocridad.
reply of tweet:@MovistarEstu No me habéis pedido permiso para usar la foto 🤔
reply of tweet:@MovistarEstu Yo estaba ahí con mis compis de cantera
reply of tweet:@MovistarEstu Que salgan los toreros oh oh oh!!!!
reply of tweet:@MovistarEstu Entonces salían los toreros habitualmente, ahora sólo salen los torreznos
reply of tweet:@MovistarEstu Cualquier tiempo pasado fue mejor. Asensio ya estaba por aquel entonces mamando del frasco?
reply of tweet:@MovistarEstu Claro, cuando Nacho aprobó la selectividad a la 17a
reply of tweet:@MovistarEstu 17 años ya!!! Lo recuerdo como si fuera ayer. Se forzó quinto partido de la final ACB con el Farsa. Patterson, Nicola Loncar...
reply of tweet:@MovistarEstu Segundo partido en Vistalegre de la final de liga contra el FCBarcelona. Tremenda exhibición, ambientazo en las gradas y 2-2. Todo se decidirá en el Palau (cuando ya debía estar finiquitada la final tras algún arbitraje "ejem-ejem" en Barcelona)...
reply of tweet:@MovistarEstu Pase a la final ACB?
What am I doing wrong?
From the documentation for Twitter's standard search API, which Tweepy's API.search uses:
Keep in mind that the search index has a 7-day limit. In other words, no tweets will be found for a date older than one week.
https://developer.twitter.com/en/docs/twitter-api/v1/tweets/search/guides/standard-operators also says:
The Search API is not a complete index of all Tweets, but instead an index of recent Tweets. The index includes between 6-9 days of Tweets.
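So replies to a Tweet older than roughly a week will simply not be returned by the standard search. As a rough way to check this up front, here is a minimal sketch that estimates a Tweet's creation time from its ID and compares it with a 7-day window. It relies on Twitter's "snowflake" ID layout (milliseconds since a custom epoch stored above the low 22 bits); the helper names `tweet_id_to_datetime` and `is_within_search_window`, the `window_days` default, and the reference date in the usage lines are my own assumptions for illustration, not part of Tweepy or the Twitter API.

```python
from datetime import datetime, timedelta, timezone

# Epoch offset (in milliseconds) used by Twitter's snowflake IDs.
TWITTER_EPOCH_MS = 1288834974657


def tweet_id_to_datetime(tweet_id):
    """Estimate the UTC creation time encoded in a snowflake Tweet ID."""
    timestamp_ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


def is_within_search_window(tweet_id, now=None, window_days=7):
    """Return True if the Tweet is recent enough for the standard search index.

    window_days=7 is an assumption based on the documented 7-day limit;
    the real index is described as covering between 6 and 9 days.
    """
    now = now or datetime.now(timezone.utc)
    return now - tweet_id_to_datetime(tweet_id) <= timedelta(days=window_days)


# Hypothetical reference date, chosen only to illustrate the two IDs from the question.
reference = datetime(2021, 6, 14, tzinfo=timezone.utc)
for tid in (1391368878861824002, 1403443418085265411):
    print(tid, tweet_id_to_datetime(tid).date(), is_within_search_window(tid, now=reference))
```

With a check like this you can skip the search (or fall back to another data source) for Tweets that are already outside the window, instead of iterating the whole cursor and finding nothing.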