java twitter twitter4j

twitter4j result.nextQuery() returns results from the first page again


I've created a Twitter crawler that collects tweets, and their associated data, for certain hashtags. After running for more than a day, it started returning old data that is already stored in my database. I now have exactly 216,874 tweets with the #jesuischarlie hashtag. I start the search with QueryResult result = twitter.search(new Query("#jesuischarlie"));
and then run a do-while loop that does not exit until result.nextQuery() == null.

My question is: why doesn't nextQuery() just return null, which would mean Twitter won't provide any further tweets for this search? Why does it start all over again?

Here is the full function I'm using

try {
    Query query = new Query("#jesuischarlie");
    query.setSince("2015-01-08");
    query.setCount(100);
    QueryResult result;
    do {
        result = twitter.search(query);
        List<Status> tweets = result.getTweets();
        for (Status tweet : tweets) {
            Twitter_loop_dao dao = new Twitter_loop_dao();
            try {
                dao.insertTwet(tweet);
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
        Thread.sleep(15 * 1000);
    } while ((query = result.nextQuery()) != null);
    System.exit(0);
} catch (TwitterException te) {
    te.printStackTrace();
    System.out.println("Failed to search tweets: " + te.getMessage());
    System.exit(-1);
}

Solution

  • It looks like you are using the wrong exit condition in your while loop. The following code works for me.

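    QueryResult result = null;
    do {
        try {
            result = twitter.search(query);
            List<Status> tweets = result.getTweets();
            // Map each tweet to a domain object (myTweetFunction is user-defined).
            List<MyObject> myObjects = tweets.parallelStream()
                    .map(tweet -> myTweetFunction(tweet))
                    .collect(Collectors.toList());

            // nextQuery() returns null once there is no next page.
            query = result.nextQuery();
            // Sleep until the rate-limit window resets if no calls remain.
            checkRateLimit(result);
        } catch (TwitterException e) {
            // do whatever you want
        }
        // result == null means the first search failed, so the same query is retried.
    } while (result == null || result.hasNext());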
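
If the search still wraps around to the first page, one defensive option is to track the tweet IDs already stored and stop once a page contains nothing new. The following is a minimal sketch under stated assumptions, not Twitter4J code: it uses only the Java standard library, and `Page`, `simulatedPage`, and `collectNewIds` are hypothetical stand-ins for `QueryResult`, `twitter.search(query)`, and the crawl loop above. In a real crawler the IDs would come from `Status.getId()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.IntFunction;

// Hypothetical stand-in for one page of search results (not a Twitter4J type).
class Page {
    final List<Long> ids;   // tweet IDs on this page
    final boolean hasNext;  // plays the role of QueryResult.hasNext()

    Page(List<Long> ids, boolean hasNext) {
        this.ids = ids;
        this.hasNext = hasNext;
    }
}

class PagingSketch {
    // Walks pages via fetchPage(pageIndex) and returns IDs in first-seen order.
    // Stops when there is no next page OR when a page adds no unseen ID,
    // which is treated here as a sign that the results wrapped around.
    static List<Long> collectNewIds(IntFunction<Page> fetchPage, int maxPages) {
        Set<Long> seen = new HashSet<>();
        List<Long> ordered = new ArrayList<>();
        for (int i = 0; i < maxPages; i++) {
            Page page = fetchPage.apply(i);
            boolean addedAny = false;
            for (Long id : page.ids) {
                if (seen.add(id)) {   // add() returns false for duplicates
                    ordered.add(id);
                    addedAny = true;
                }
            }
            if (!page.hasNext || !addedAny) {
                break;
            }
        }
        return ordered;
    }

    // Simulated source: three distinct pages, then the first page again,
    // always claiming that another page exists.
    static Page simulatedPage(int index) {
        List<List<Long>> pages = Arrays.asList(
                Arrays.asList(9L, 8L, 7L),
                Arrays.asList(6L, 5L, 4L),
                Arrays.asList(3L, 2L, 1L));
        return new Page(pages.get(index % pages.size()), true);
    }

    public static void main(String[] args) {
        List<Long> ids = collectNewIds(PagingSketch::simulatedPage, 100);
        System.out.println(ids.size() + " unique IDs collected");
    }
}
```

With the simulated source, the loop stops on the fourth fetch because that page repeats the first one. In a real crawler, a unique key on the tweet ID column in the database could play the role of the `seen` set.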
    

    The checkRateLimit function:

    private void checkRateLimit(QueryResult result) {
    
        if (result.getRateLimitStatus().getRemaining() <= 0){
            try {
              Thread.sleep(result.getRateLimitStatus().getSecondsUntilReset() * 1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
                throw new RuntimeException(e);
            }
        }
    }
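
`Thread.sleep` throws `IllegalArgumentException` for a negative argument, so it may be worth clamping the wait in case `getSecondsUntilReset()` comes back negative (for example, if the local clock runs ahead of Twitter's). A minimal sketch, assuming a hypothetical helper `sleepMillisUntilReset` and an arbitrary one-second safety margin; neither comes from Twitter4J:

```java
class RateLimitSketch {
    // Hypothetical helper (not part of Twitter4J): converts the values reported
    // by a rate-limit status into a non-negative sleep time in milliseconds.
    static long sleepMillisUntilReset(int remaining, int secondsUntilReset) {
        if (remaining > 0) {
            return 0L;  // calls are still available, no need to wait
        }
        long seconds = Math.max(0, secondsUntilReset);  // clamp negative values
        return (seconds + 1) * 1000L;                   // +1 s margin (assumption)
    }

    public static void main(String[] args) throws InterruptedException {
        long millis = sleepMillisUntilReset(0, -3);
        System.out.println("Sleeping for " + millis + " ms");
        Thread.sleep(millis);
    }
}
```

Inside `checkRateLimit`, the two arguments would be `result.getRateLimitStatus().getRemaining()` and `result.getRateLimitStatus().getSecondsUntilReset()`.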
    

    Hope that helps.