R - How to populate weather data from Weather Underground?

How can I populate my local data frame with weather data from Weather Underground?

For instance I have this local data frame:

structure(list(particles = c(1, 2, 3, 4, 5, 6), timestamp = c(1469945933290, 
1469945937786, 1469945940819, 1469945944000, 1469945948113, 1469945951115
), date = structure(c(1469945933.29, 1469945937.786, 1469945940.819, 
1469945944, 1469945948.113, 1469945951.115), class = c("POSIXct", 
"POSIXt"), tzone = "UTC-1")), .Names = c("particles", "timestamp", 
"date"), row.names = c(NA, -6L), class = "data.frame")

I am going to pull weather data for the London area, and I will get a set of data like this:

  "response": {
  "features": {
  "geolookup": 1
  "conditions": 1
        ,   "location": {
        "country_name":"United Kingdom",
        "nearby_weather_stations": {
        "airport": {
        "station": [
        { "city":"London / Heathrow Airport", "state":"", "country":"United Kingdom", "icao":"", "lat":"51.47920609", "lon":"-0.45060000" }
        ,{ "city":"London", "state":"", "country":"UK", "icao":"EGLL", "lat":"51.47750092", "lon":"-0.46138901" }
        ,{ "city":"Northolt", "state":"", "country":"UK", "icao":"EGWU", "lat":"51.54868317", "lon":"-0.41691700" }
        ,{ "city":"Farnborough", "state":"", "country":"UK", "icao":"EGLF", "lat":"51.27999115", "lon":"-0.77269602" }
        "pws": {
        "station": [
        "neighborhood":"Burns Way",
        "neighborhood":"Willowbrook Road",
        "city":"Iver, South Bucks",
        "neighborhood":"Thetford Road",
        "neighborhood":"Ashford Surrey",
        "neighborhood":"Oakfield Road",
        "neighborhood":"Hewens Road",
        "neighborhood":"Nelson Road",
        "neighborhood":"Waters Drive",
        "neighborhood":"East Berkshire Weather",
        "neighborhood":"Wills Crescent",
        "neighborhood":"Adelphi Crescent",
        "neighborhood":"Attlee Road",
        "neighborhood":"Langley Weather Station",
        "city":"Langley , Slough",
        "state":"BERKSHIRE. U.K.",
        "neighborhood":"Constance Road",
        "neighborhood":"Prospect Crescent",
        "neighborhood":"Sunbury on Thames",
        "neighborhood":"Hythe Road",
        "country":"UNITED KINGDOM",
        "neighborhood":"Clifton Road",
        "neighborhood":"Osterley Crescent",
        "neighborhood":"Morris Avenue",
        "neighborhood":"Fortescue Avenue",
  , "current_observation": {
        "image": {
        "title":"Weather Underground",
        "display_location": {
        "full":"London, United Kingdom",
        "state_name":"United Kingdom",
        "observation_location": {
        "full":"London, ",
        "elevation":"79 ft"
        "estimated": {
        "observation_time":"Last Updated on July 31, 7:20 AM BST",
        "observation_time_rfc822":"Sun, 31 Jul 2016 07:20:00 +0100",
        "local_time_rfc822":"Sun, 31 Jul 2016 07:26:14 +0100",
        "weather":"Partly Cloudy",
        "temperature_string":"59 F (15 C)",
        "wind_string":"From the Variable at 4 MPH",
        "dewpoint_string":"50 F (10 C)",
        "feelslike_string":"59 F (15 C)",
        "UV":"1","precip_1hr_string":"-9999.00 in (-9999.00 mm)",
        "precip_today_string":"0.00 in (0.0 mm)",

I want to merge this weather data with my data frame, matching rows where the two timestamps (local and Weather Underground) are equal or close, so that the result looks like this:

particles timestamp date ws wd    humidity   temperature
xx        xxx       xx   4  300   72         14
and so on... 

Is it possible?

Or are there any alternatives to Weather Underground?


  • Load libraries in your workspace
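
    The code for this step is missing from the post. A minimal sketch, assuming that fromJSON comes from the jsonlite package (the post does not say which JSON package it used) and that map comes from purrr, which the answer names further down:

    ```r
    # Assumption: jsonlite provides fromJSON(); the post does not name the package
    library(jsonlite)
    # purrr provides map(), used later in this answer
    library(purrr)
    ```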


    Get the names of the list in the json file - temp.json

    names_json <- names(fromJSON("temp.json"))

    Get the list of observations

    obs_list <- fromJSON("temp.json")[["current_observation"]]

    Note down the parameters in a list

    params <- list("observation_epoch", "local_tz_long", "wind_string", "wind_degrees", "relative_humidity", "temp_f", "temp_c")

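    Using the map function from the purrr package, loop through the params list and get the values from obs_list

    # stringsAsFactors = FALSE keeps the values as character (the default was TRUE before R 4.0)
    new_df <- data.frame(map(.x = params, .f = ~ {obs_list[[.x]]}), stringsAsFactors = FALSE)

    Set the names of new_df as defined in params

    names(new_df) <- params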

    Convert the epoch in observation_epoch from character to integer

    new_df$observation_epoch <- as.integer(new_df$observation_epoch)
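
    The other fields may need the same kind of conversion. As a hedged sketch that is not part of the original steps, relative_humidity arrives as text such as "72%" and could be turned into a number by stripping the percent sign. The stand-in new_df below only holds the two values shown in the example output of this answer:

    ```r
    # Stand-in for new_df, using values from the example output in this answer
    new_df <- data.frame(
      relative_humidity = "72%",
      temp_c = 15,
      stringsAsFactors = FALSE
    )

    # Remove the literal percent sign, then convert the remaining text to a number
    new_df$relative_humidity <- as.numeric(sub("%", "", new_df$relative_humidity, fixed = TRUE))
    ```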

    Load data.table library into workspace


    Convert new_df and local_df dataframes into datatables by reference
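
    The code for this step and for the library-loading step before it did not survive in the post. A sketch, assuming they were the usual data.table calls; the two small data frames are stand-ins with illustrative values so that the block runs on its own:

    ```r
    library(data.table)

    # Stand-ins for the data frames built earlier (illustrative values only)
    local_df <- data.frame(
      particles = 1:2,
      date = as.POSIXct(c(1469945933, 1469945937), origin = "1970-01-01", tz = "UTC")
    )
    new_df <- data.frame(observation_epoch = 1469946000L, temp_c = 15)

    # setDT() converts a data.frame to a data.table in place, without copying
    setDT(local_df)
    setDT(new_df)
    ```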


    Convert the integer epoch in observation_epoch of new_df to a POSIXct date-time

    # tz must be a single string, so take the first value of local_tz_long
    new_df[, observation_epoch := as.POSIXct(observation_epoch, origin = "1970-01-01", tz = local_tz_long[1])]

    Join the two data.tables on their date columns; where there is no match, the local_df columns are filled with NA.

    local_df[new_df, on = .(date == observation_epoch), nomatch = NA]


    #  particles timestamp                date local_tz_long                wind_string wind_degrees relative_humidity temp_f   temp_c
    # 1:     NA        NA 2016-07-31 07:20:00 Europe/London From the Variable at 4 MPH            0               72%     59  15 

    The local_df in your question has no date that matches the observation, so for illustration I overwrote its first date with the observation time from new_df:

    # local_df$date[1] before the change:
    # [1] "2016-07-31 07:18:53 BST"
    local_df$date[1] <- new_df$observation_epoch[1]

    Now merge the two datatables again

    local_df[new_df, on = .(date == observation_epoch), nomatch = NA]


    #   particles    timestamp                date local_tz_long                wind_string wind_degrees relative_humidity temp_f    temp_c
    # 1:        1 1.469946e+12 2016-07-31 07:20:00 Europe/London From the Variable at 4 MPH            0               72%     59    15
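
    The join above only matches identical timestamps, while the question also asks about timestamps that are close. One possible approach, given here as a sketch rather than as part of the original answer, is a data.table rolling join with roll = "nearest". The names local_dt, weather_dt and obs_time are made up for this example, and the weather values are hypothetical:

    ```r
    library(data.table)

    # Local measurements: first three dates from the question, truncated to whole seconds
    local_dt <- data.table(
      particles = 1:3,
      date = as.POSIXct(c(1469945933, 1469945937, 1469945940),
                        origin = "1970-01-01", tz = "UTC")
    )

    # Hypothetical weather observations taken 30 minutes apart
    weather_dt <- data.table(
      observation_epoch = as.POSIXct(c(1469944200, 1469946000),
                                     origin = "1970-01-01", tz = "UTC"),
      temp_c = c(14, 15)
    )

    # Keep a copy of the observation time: in the result, the join column
    # holds the local dates instead of the observation times
    weather_dt[, obs_time := observation_epoch]

    # For every local row, attach the weather row with the nearest timestamp
    merged <- weather_dt[local_dt, on = .(observation_epoch = date), roll = "nearest"]
    setnames(merged, "observation_epoch", "date")
    ```

    Every local row keeps its own date and gains the weather columns of the closest observation, with obs_time showing which observation was used.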