
R + Weather Underground - how to use longitude and latitude to get the closest airport station?


How can I find the nearest airport station using longitude and latitude?

For instance, I have this JSON data stored in my db:

 "location" : {
    "long" : "Devon, 8 Market Road, Plympton, Plymouth PL7 1QW, United Kingdom",
    "street_number" : "",
    "route" : "Market Road",
    "locality" : "Plymouth",
    "administrative_area_level_1" : "England",
    "country" : "United Kingdom",
    "postal_code" : "PL7 1QW",
    "lat" : "50.38693379999999",
    "lng" : "-4.0598999999999705"
}

I know that my locality is Plymouth, so I request the station data from Weather Underground via this URL:

http://api.wunderground.com/api/[MY-API-CODE]/geolookup/conditions/q/UK/Plymouth.json

Here is how I do it:

locality <- 'Plymouth'

pullUrl <- paste(apiUrl, 'geolookup/conditions/q/UK/', locality, '.json', sep='')

# Read the response as raw lines from the web service.
conn <- url(pullUrl)
rawData <- readLines(conn, n=-1L, ok=TRUE)
close(conn)

# Parse the JSON text into an R list.
geoData <- fromJSON(paste(rawData, collapse=""))

# Get the station data in location only.
# Turn the result into a data frame.
stationsDF <- as.data.frame(geoData$location$nearby_weather_stations$airport$station)

This gives me three stations:

      city state        country icao         lat         lon
1 Plymouth       United Kingdom EGDB 50.35491562 -4.12105608
2   Exeter                   UK EGTE 50.73714066 -3.40577006
3 Culdrose                   UK EGDR 50.08427429 -5.25711393

My problem is this: how can I ensure that I get EGDB rather than EGTE or EGDR, given that Plympton is closest to Plymouth?

Can I use the lat and lng stored in my db to determine which station is the closest?

"lat" : "50.38693379999999",
"lng" : "-4.0598999999999705"

In other words, how can I work out that the lat and lng above should map to EGDB (50.35491562, -4.12105608)?

Any ideas?

EDIT:

stationsDF <- as.data.frame(geoData$location$nearby_weather_stations$airport$station, stringsAsFactors=FALSE)
df <- setDT(stationsDF)
loc <- c(lat = "50.38693379999999", lng = "-4.0598999999999705")
dists <- geosphere::distHaversine(as.numeric(loc[c('lng', 'lat')]), df[, c('lon', 'lat')])

Error:

Error in .pointsToMatrix(p2) * toRad : 
  non-numeric argument to binary operator
In addition: Warning message:
In .pointsToMatrix(p2) : NAs introduced by coercion

EDIT 2:

stationsDF <- as.data.frame(geoData$location$nearby_weather_stations$airport$station, stringsAsFactors=FALSE)
dput(stationsDF)

Output:

structure(list(city = c("Plymouth", "Exeter", "Culdrose"), state = c("", 
"", ""), country = c("United Kingdom", "UK", "UK"), icao = c("EGDB", 
"EGTE", "EGDR"), lat = c("50.35491562", "50.73714066", "50.08427429"
), lon = c("-4.12105608", "-3.40577006", "-5.25711393")), .Names = c("city", 
"state", "country", "icao", "lat", "lon"), class = "data.frame", row.names = c(NA, 
-3L))

EDIT 3:

And the structure of stationsDF:

str(stationsDF)

Output:

'data.frame':   3 obs. of  6 variables:
 $ city   : chr  "Plymouth" "Exeter" "Culdrose"
 $ state  : chr  "" "" ""
 $ country: chr  "United Kingdom" "UK" "UK"
 $ icao   : chr  "EGDB" "EGTE" "EGDR"
 $ lat    : chr  "50.35491562" "50.73714066" "50.08427429"
 $ lon    : chr  "-4.12105608" "-3.40577006" "-5.25711393"

Solution

  • If you've got the data already, say

    # 'state' is empty for every station, so it is left out here. With one fewer
    # header field than data fields, read.table uses the first column as row names.
    df <- read.table(text = 'city        country icao         lat         lon
                       1 Plymouth "United Kingdom" EGDB 50.35491562 -4.12105608
                       2   Exeter               UK EGTE 50.73714066 -3.40577006
                       3 Culdrose               UK EGDR 50.08427429 -5.25711393', header = TRUE)
    
    loc <- c(lat = "50.38693379999999", lng = "-4.0598999999999705")
    

    Then you can use geosphere::distHaversine to calculate the distances (in meters, by default) between loc and each observation of df:

    dists <- geosphere::distHaversine(as.numeric(loc[c('lng', 'lat')]), df[, c('lon', 'lat')])
    
    dists
    ## [1]  5617.667 60493.398 91661.079
    

    With which.min, you can index df to get the nearest station:

    df[which.min(dists), ]
    ##       city        country icao      lat       lon
    ## 1 Plymouth United Kingdom EGDB 50.35492 -4.121056
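
One caveat for the data in the question: there, lat and lon are character columns (see the str output), whereas read.table yields numeric ones, and the distance calculation needs numeric coordinates. The sketch below converts the columns first and then picks the closest station. So that it runs without extra packages, it uses a small base-R haversine helper; haversine_m is a hypothetical name introduced here, not part of geosphere, and it assumes the same Earth radius of 6378137 m that geosphere::distHaversine uses by default. Once the coordinates are numeric, geosphere::distHaversine should serve equally well.

```r
# Station data as in the question's dput output: coordinates are character strings.
stationsDF <- data.frame(
  city = c("Plymouth", "Exeter", "Culdrose"),
  icao = c("EGDB", "EGTE", "EGDR"),
  lat  = c("50.35491562", "50.73714066", "50.08427429"),
  lon  = c("-4.12105608", "-3.40577006", "-5.25711393"),
  stringsAsFactors = FALSE
)
loc <- c(lat = "50.38693379999999", lng = "-4.0598999999999705")

# Convert everything to numeric before doing any arithmetic.
stations <- cbind(lon = as.numeric(stationsDF$lon),
                  lat = as.numeric(stationsDF$lat))
target <- as.numeric(loc[c("lng", "lat")])  # order: longitude, latitude

# Hypothetical helper: haversine distance in meters from one point
# (lon, lat) to each row of a matrix with columns "lon" and "lat".
haversine_m <- function(p, pts, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (pts[, "lat"] - p[2]) * to_rad
  dlon <- (pts[, "lon"] - p[1]) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(p[2] * to_rad) * cos(pts[, "lat"] * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

dists <- haversine_m(target, stations)

# Row of the closest station.
nearest <- stationsDF[which.min(dists), ]
nearest$icao
```

Keeping the original stationsDF untouched and building a separate numeric matrix means the character columns from the JSON stay available as they were delivered, while only the distance calculation sees numbers.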