api twitter twitter4j

Get ALL tweets, not just recent ones via twitter API (Using twitter4j - Java)


I've built an app using twitter4j which pulls in a bunch of tweets when I enter a keyword, takes the geolocation out of each tweet (or falls back to the profile location), then maps them using ammaps. The problem is that I'm only getting a small portion of the tweets. Is there some kind of limit here? I've got a DB collecting the tweet data, so soon enough it will have a decent amount, but I'm curious why I'm only getting tweets from the last 12 hours or so.

For example, if I search by my username I only get one tweet, the one I sent today.

Thanks for any info!

EDIT: I understand Twitter doesn't allow public access to the firehose. My question is more about why I'm limited to finding only recent tweets.


Solution

  • You need to keep repeating the query, lowering the maxId each time, until you get nothing back. You can also use setSince and setUntil to bound the date range.

    An example:

    // keyword is a placeholder for your search term (the original snippet omitted it)
    Query query = new Query(keyword);
    query.setCount(DEFAULT_QUERY_COUNT);
    query.setLang("en");
    // Set the bounding dates; sdf is assumed to format dates as yyyy-MM-dd
    query.setSince(sdf.format(startDate));
    query.setUntil(sdf.format(endDate));

    // searchWithRetry is my function that deals with rate limits
    QueryResult result = searchWithRetry(twitter, query);

    // Keep paging backwards until a search returns no tweets
    while (!result.getTweets().isEmpty()) {
        List<Status> tweets = result.getTweets();
        System.out.println("# Tweets:\t" + tweets.size());
        long minId = Long.MAX_VALUE;

        for (Status tweet : tweets) {
            // do stuff here
            if (tweet.getId() < minId) {
                minId = tweet.getId();
            }
        }
        // maxId is inclusive, so step just below the oldest tweet seen
        query.setMaxId(minId - 1);
        result = searchWithRetry(twitter, query);
    }
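
To see the paging logic on its own, here is a minimal sketch that runs without twitter4j or network access. It swaps the real search call for an in-memory stand-in: `MaxIdPagingSketch`, `fakeSearch`, and `collectAll` are hypothetical names invented for illustration, and the IDs are made-up sample data. It assumes, as the search API does, that maxId is inclusive and that results come back newest first.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: fakeSearch stands in for a real search call and is not part of twitter4j.
class MaxIdPagingSketch {

    /** Returns up to pageSize IDs that are <= maxId, newest first (hypothetical stand-in). */
    static List<Long> fakeSearch(List<Long> allIdsNewestFirst, long maxId, int pageSize) {
        List<Long> page = new ArrayList<>();
        for (long id : allIdsNewestFirst) {
            if (id <= maxId && page.size() < pageSize) {
                page.add(id);
            }
        }
        return page;
    }

    /** Repeats the search, lowering maxId each time, until a page comes back empty. */
    static List<Long> collectAll(List<Long> allIdsNewestFirst, int pageSize) {
        List<Long> collected = new ArrayList<>();
        long maxId = Long.MAX_VALUE;
        List<Long> page = fakeSearch(allIdsNewestFirst, maxId, pageSize);
        while (!page.isEmpty()) {
            long minId = Long.MAX_VALUE;
            for (long id : page) {
                collected.add(id);
                minId = Math.min(minId, id);
            }
            // maxId is inclusive, so step just below the oldest ID seen;
            // without the "- 1" the oldest item would be returned again forever.
            maxId = minId - 1;
            page = fakeSearch(allIdsNewestFirst, maxId, pageSize);
        }
        return collected;
    }

    public static void main(String[] args) {
        List<Long> ids = List.of(107L, 105L, 104L, 101L, 100L);
        System.out.println(collectAll(ids, 2)); // prints [107, 105, 104, 101, 100]
    }
}
```

With a page size of 2, the five sample IDs come back over three non-empty pages, and a fourth, empty page ends the loop. The same shape applies to the twitter4j version above, where `query.setMaxId(minId - 1)` plays the role of `maxId = minId - 1`.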