Search code examples
xmlrgoogle-mapsgeolocationweather-api

Mapping geo-location (coordinates or city) to WorldBank climate API river basin id


I'm interested in historical average temperatures on the city level. WorldBank climate API provides such historical data on the country or river basin level. Country level is too course for me. River basin level would be very good, but I haven't found a way how to convert geo-coordinate to river basins ids to map cities to the river basins. Shape files of the basins are provided by waterbase.org. Any help to map geo-coordinates to WorldBank river basin ids is welcome.


Solution

  • Despite lack of originating code, here's how you can get a complete SpatialPolygonsDataFrame with a basin_id associated with each polygon.

    One part of their API lets you specify a basin ID and that returns a KML file of the region. So you can query all of them and stick'em together:

    library(httr)
    library(sp)
    library(rgdal)
    library(pbapply)
    
    basin_list <- pblapply(1:468, function(basin_id) {
    
      res <- GET(sprintf("http://climatedataapi.worldbank.org/climateweb/rest/v1/basin/kml/%d", basin_id))
      fil <- tempfile(fileext=".xml")
      writeLines(content(res, as="text"), fil)
      kml <- readOGR(fil, "Layer #0", verbose=FALSE, stringsAsFactors=FALSE)
      unlink(fil)
      kml$basin_id = basin_id
      # these are useless
      kml$Name <- kml$Description <-NULL
      kml
    
    })
    
    basin_list <- c(basin_list, makeUniqueIDs=TRUE)
    basins <- do.call(rbind, basin_list)
    basins@data
    
    ##    basin_id
    ## 0         1
    ## 01        2
    ## 02        3
    ## 03        4
    ## 04        5
    ## 05        6
    ## 06        7
    ## 07        8
    ## 08        9
    ## 09       10
    

    Obviously I didn't sit for many minutes for all 468, just 10 of them.

    (Actually, I let it run while doing other work and saved you the trouble of having to generate the files. You can get them at this github repo)

    To see that it is getting all the basins:

    plot(basins) 
    

    enter image description here

    There is an issue with topology errors from the giant data frame, so if you do need to determine which polygon a particular lat/lon pair is in (so you can map it to a basin id) then you have to kinda work at it a bit:

    library(rgeos)
    
    # pick a random point in basin 3
    spot <- spsample(basin_list[[3]], 1, "random")
    
    # now try to find it in reverse
    which(sapply(1:468, function(i) gContains(basin_list[[i]], spot)))
    

    These aren't complex polygons so it's pretty fast.

    Others may have time to simplify this a bit (i.e. correct rbinded polygon topology errors).

    Also, please consider being kind to the WorldBank API server and doing both:

    save(basin_list, basins, file="basins.rda")
    

    and

    writeOGR(basins, ".", "basins", driver="ESRI Shapefile")
    # or 
    geojsonio::geojson_write(basins, geometry="polygon", group="basin_id", "basins.json")
    

    so you have the data sets locally and do not need to recreate them.