Iam currently doing a tweet search using Twitter Api
. However, taking the tweet id is not working for me.
Here is my code:
searchQuery = '#BLM' # this is what we're searching for
searchQuery = searchQuery + "-filter:retweets"
Geocode="39.8, -95.583068847656, 2500km"
maxTweets = 1000000 # Some arbitrary large number
tweetsPerQry = 100 # this is the max the API permits
fName = 'tweetsBLM.json' # We'll store the tweets in a json file.
sinceId = None
#max_id = -1 # initial search
max_id=1278836959926980609 # the last id of previous search
tweetCount = 0
print("Downloading max {0} tweets".format(maxTweets))
with open(fName, 'w') as f:
while tweetCount < maxTweets:
try:
if (max_id <= 0):
if (not sinceId):
new_tweets = api.search(q=searchQuery,lang="en", geocode=Geocode,
count=tweetsPerQry)
else:
new_tweets = api.search(q=searchQuery,lang="en",geocode=Geocode,
count=tweetsPerQry,
since_id=sinceId )
else:
if (not sinceId):
new_tweets = api.search(q=searchQuery, lang="en", geocode=Geocode,
count=tweetsPerQry,
max_id=str(max_id - 1) )
else:
new_tweets = api.search(q=searchQuery, lang="en", geocode=Geocode,
count=tweetsPerQry,
max_id=str(max_id - 1),
since_id=sinceId)
if not new_tweets:
print("No more tweets found")
break
for tweet in new_tweets:
f.write(jsonpickle.encode(tweet._json, unpicklable=False) +
'\n')
tweetCount += len(new_tweets)
print("Downloaded {0} tweets".format(tweetCount))
max_id = new_tweets[-1].id
except tweepy.TweepError as e:
# Just exit if any error
print("some error : " + str(e))
print('exception raised, waiting 15 minutes')
print('(until:', dt.datetime.now() + dt.timedelta(minutes=15), ')')
time.sleep(15*60)
break
print ("Downloaded {0} tweets, Saved to {1}".format(tweetCount, fName))
This code works perfectly fine. I initially run it and got about 40 000 tweets. Then i took the id
of the last tweet
of previous/initial search to go back in time. However, i was disappointed to see that there were no tweets anymore. I can not believe that for a second. I must be going wrong somewhere because #BLM
has been very active in the last 2/3 months.
Any help is very welcome. Thank you
I may have found the answer. Using Twitter API
, it is not possible to get older tweets (7 days old or more). Using max_id
to get around this is not possible either.
The only way is to stream and wait for more than 7 days.
Finally, there is also this link that look for older tweets
https://pypi.org/project/GetOldTweets3/ it is an extension of the original Jefferson Henrique's work