Tags: r, bing-maps, geodata, partitioning

R - Applying formula through partitioning a dataset


I need to use the Bing Maps API to get coordinate data for a list of pincodes (India). I can do this for small datasets with the following code:

get_lat_long <- function(pincodes) {
  # key: https://www.bingmapsportal.com/Application
  require(XML); require(data.table)
  # seed the output with a placeholder row; results are prepended in the loop
  PinCodeLatLong <- data.frame(pincode = "Temp", lat = "Lat", long = "Long")
  for (i in seq_along(pincodes)) {
    var <- pincodes[i]
    # Bing Maps REST Locations request for a single postal code
    link <- paste0("http://dev.virtualearth.net/REST/v1/Locations?postalCode=",
                   var, "&o=xml&maxResults=1&key=<mykey>")
    xml_data <- xmlToList(xmlParse(link))
    # pull the latitude/longitude out of the parsed XML response
    point <- xml_data$ResourceSets$ResourceSet$Resources$Location$Point
    PinCodeLatLongtemp <- data.frame(pincode = var,
                                     lat = point$Latitude,
                                     long = point$Longitude)
    PinCodeLatLong <- rbindlist(list(PinCodeLatLongtemp, PinCodeLatLong), fill = TRUE)
  }
  return(PinCodeLatLong)
}
master_lat_long <- get_lat_long(pincode_map$Pincode)
master_lat_long <- dplyr::filter(master_lat_long, !is.na(pincode))   # drop failed lookups
master_lat_long <- master_lat_long[!duplicated(master_lat_long), ]   # drop duplicate rows
pincode_map <- merge(pincode_map, master_lat_long, by.x = "Pincode", by.y = "pincode", all.y = FALSE)

However, the Bing Maps Basic API only allows 2,500 data points at a time, and I need to do this for a large dataset (100,000+ pincodes). What would be the best way to partition the pincode list and apply the function in batches to build master_lat_long? Is there a way to do this automatically?


Solution

  • There are various ways to split the data into batches of 2,500 values.

    One way, with ceiling, is shown below, followed by a sketch for recombining the batches:

    result <- by(pincode_map$Pincode, 
                 ceiling(seq_len(nrow(pincode_map))/2500), get_lat_long)
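
    by() returns a list-like object with one data.table per 2,500-pincode batch (get_lat_long already returns a data.table). A minimal sketch for recombining the batches, reusing the same clean-up and merge steps as in the question (assuming result comes from the call above):

    # stack the per-batch results back into one table
    master_lat_long <- data.table::rbindlist(result, fill = TRUE)

    # same post-processing as before: drop failed lookups and duplicate rows,
    # then attach the coordinates to pincode_map
    master_lat_long <- dplyr::filter(master_lat_long, !is.na(pincode))
    master_lat_long <- master_lat_long[!duplicated(master_lat_long), ]
    pincode_map <- merge(pincode_map, master_lat_long,
                         by.x = "Pincode", by.y = "pincode", all.y = FALSE)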