
Can't get latitude and longitude values of tweets


I collected some Twitter data like this:

#connect to twitter API
setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)

#set radius and amount of requests
N=200  # tweets to request from each query
S=200  # radius in miles

lats=c(38.9,40.7)
lons=c(-77,-74)

roger=do.call(rbind,lapply(1:length(lats), function(i) searchTwitter('Roger+Federer',
    lang="en",n=N,resultType="recent",
    geocode=paste(lats[i],lons[i],paste0(S,"mi"),sep=","))))

After that I ran:

rogerlat=sapply(roger, function(x) as.numeric(x$getLatitude()))
rogerlat=sapply(rogerlat, function(z) ifelse(length(z)==0,NA,z))  

rogerlon=sapply(roger, function(x) as.numeric(x$getLongitude()))
rogerlon=sapply(rogerlon, function(z) ifelse(length(z)==0,NA,z))  

data=as.data.frame(cbind(lat=rogerlat,lon=rogerlon))

Now I would like to keep only the tweets that have longitude and latitude values:

data=filter(data, !is.na(lat),!is.na(lon))
lonlat=select(data,lon,lat)

But now I only get NA values. Any thoughts on what is going wrong here?


Solution

  • As Chris mentioned, searchTwitter does not return the lat-long of a tweet. You can see this in the twitteR documentation, which says that it returns a list of status objects.

    Status Objects

    Scrolling down to the status object, you can see that it includes 11 pieces of information, but lat-long is not one of them. All is not lost, though, because the user's screen name is returned.

    If we look at the user object, we see that it at least includes a location.

    So I can think of at least two possible solutions, depending on what your use case is.

    Solution 1: Extracting a User's Location

    library(twitteR)

    N <- 200  # tweets to request, as in the question

    # Search for recent Trump tweets #
    tweets <- searchTwitter('Trump', lang="en", n=N, resultType="recent",
                            geocode='38.9,-77,50mi')
    
    # If you want, convert tweets to a data frame #
    tweets_df <- twListToDF(tweets)
    
    # Look up the users #
    users <- lookupUsers(tweets_df$screenName)
    
    # Convert users to a data frame, look at their location #
    users_df <- twListToDF(users)
    
    table(users_df[1:10, 'location'])
    
                                           ❤ Texas  ❤ ALT.SEATTLE.INTERNET.UR.FACE 
                       2                            1                            1 
                   Japan             Land of the Free                  New Orleans 
                       1                            1                            1 
      Springfield OR USA                United States                          USA 
                       1                            1                            1 
    
    # Note that these will be the users' self-reported locations,
    # so potentially they are not that useful
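
    Since the locations are free text, it can help to check how many users supplied one at all before relying on them. The following is only a rough sketch: the sample vector is copied from the plain-text values in the table output above, and in practice you would use users_df$location instead. The names cleaned and share_with_location are made up for illustration.

    ```r
    # Sample values taken from the table output above (two users left the field blank).
    # In a real session this would be users_df$location.
    locations <- c("", "", "Japan", "Land of the Free", "New Orleans",
                   "Springfield OR USA", "United States", "USA")

    # Treat blank or whitespace-only entries as missing.
    cleaned <- trimws(locations)
    cleaned[cleaned == ""] <- NA

    # Share of users with any self-reported location at all.
    share_with_location <- mean(!is.na(cleaned))
    ```

    Even a non-missing value such as "Land of the Free" cannot be mapped to coordinates, so this share is an upper bound on how many locations are actually usable.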
    

    Solution 2: Multiple searches with limited radius

    The other solution would be to run a series of searches, each with a small radius, shifting the latitude and longitude a little each time. That way you can be relatively sure that each tweet originated close to the centre of the search that returned it.
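
    A minimal sketch of that idea, under some assumptions: make_geocodes and search_grid are hypothetical helpers (not part of twitteR), and the bounding box, step size and radius are arbitrary values chosen for illustration. The first part only builds the geocode strings and runs without any API access. search_grid is defined but not called here, because it needs an authenticated session via setup_twitter_oauth.

    ```r
    # Build a grid of search centres covering a bounding box.
    # make_geocodes is a hypothetical helper, not a twitteR function.
    make_geocodes <- function(lat_range, lon_range, step, radius_mi) {
      grid <- expand.grid(
        lat = seq(lat_range[1], lat_range[2], by = step),
        lon = seq(lon_range[1], lon_range[2], by = step)
      )
      # Same "lat,lon,radius" format that the geocode argument expects.
      grid$geocode <- paste(grid$lat, grid$lon, paste0(radius_mi, "mi"), sep = ",")
      grid
    }

    # Illustrative box around Washington, DC: a 3 x 3 grid with a 20-mile radius.
    centres <- make_geocodes(c(38.5, 39.5), c(-77.5, -76.5),
                             step = 0.5, radius_mi = 20)

    # Run one search per centre and tag each tweet with the centre that found it.
    # Assumes twitteR is installed and setup_twitter_oauth() has been called.
    search_grid <- function(query, centres, n = 100) {
      results <- lapply(seq_len(nrow(centres)), function(i) {
        tweets <- twitteR::searchTwitter(query, lang = "en", n = n,
                                         resultType = "recent",
                                         geocode = centres$geocode[i])
        if (length(tweets) == 0) return(NULL)
        df <- twitteR::twListToDF(tweets)
        df$search_lat <- centres$lat[i]
        df$search_lon <- centres$lon[i]
        df
      })
      do.call(rbind, results)
    }
    ```

    After authenticating, a call such as search_grid('Roger+Federer', centres) would return one data frame in which search_lat and search_lon give an approximate position for each tweet. Neighbouring circles can overlap, so the same tweet may appear more than once, and every search counts against the API rate limit, so a finer grid costs more requests.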