java, twitter4j, cloudera, flume

Get country from tweet with certain keywords


I am using TwitterSource for Flume from Cloudera. I want to get tweets by country that contain certain keywords. I'm not sure what value to compare against when I want tweets from the Netherlands. The following code results in nothing being processed:

public void onStatus(Status status) {
    if(status.getPlace().getCountry().equalsIgnoreCase("netherlands")) {
        headers.put("timestamp", String.valueOf(status.getCreatedAt().getTime()));
        Event event = EventBuilder.withBody(DataObjectFactory.getRawJSON(status).getBytes(), headers);
        channel.processEvent(event);
    }
}

I don't use FilterQuery for the location because I already use it for the keywords. If I combine a location filter with the keyword filter in one query, the result is a logical OR, not an AND.

FilterQuery query = new FilterQuery().track(keywords);

Solution

  • If you analyse the stream, you'll find that most tweets don't have a location attached to them (getPlace() returns null for those). Even when a location is attached, the city, state or country may be missing or incorrect. I've also found tweets whose country names don't exist at all. So you'll have to map city names (or state names) to country names and then check whether the resulting country is the Netherlands. A geocoding service such as Google Maps can do this lookup.

    Also you may find my answer here helpful.
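The checks described above can be sketched as a small null-safe helper. This is only an illustration, not code from TwitterSource or twitter4j: the class CountryMatcher, its method names and the three-city lookup table are hypothetical, and a real city table would have to come from a gazetteer or a geocoding service. The helper takes plain strings so it can be tried without a Twitter connection.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not part of twitter4j or Flume): decides whether
// the place data attached to a tweet points at the Netherlands.
final class CountryMatcher {

    // Illustrative sample only (assumption); a real table would be far larger.
    private static final Map<String, String> CITY_TO_COUNTRY_CODE = new HashMap<>();
    static {
        CITY_TO_COUNTRY_CODE.put("amsterdam", "nl");
        CITY_TO_COUNTRY_CODE.put("rotterdam", "nl");
        CITY_TO_COUNTRY_CODE.put("utrecht", "nl");
    }

    private CountryMatcher() {}

    // Null-safe normalisation: null becomes "", otherwise trim and lower-case.
    static String normalize(String s) {
        return s == null ? "" : s.trim().toLowerCase(Locale.ROOT);
    }

    // Returns true if any of the three signals identifies the Netherlands.
    static boolean isNetherlands(String countryCode, String country, String placeName) {
        // 1. The ISO country code is the most reliable signal when present.
        if (normalize(countryCode).equals("nl")) {
            return true;
        }
        // 2. The country name can be spelled in several ways.
        String c = normalize(country);
        if (c.equals("netherlands") || c.equals("the netherlands") || c.equals("nederland")) {
            return true;
        }
        // 3. Fall back to mapping the city name to a country code.
        return "nl".equals(CITY_TO_COUNTRY_CODE.get(normalize(placeName)));
    }
}
```

In onStatus this could be used by first reading Place place = status.getPlace(), skipping the tweet when place is null, and otherwise calling CountryMatcher.isNetherlands(place.getCountryCode(), place.getCountry(), place.getName()). Because the keywords are still filtered by FilterQuery.track, the country check on the client side gives the AND behaviour the question asks for.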