Search code examples
rvisualizationdistancelatitude-longitudenearest-neighbor

Using R spatial to calculate the nearest weather stations to sample plots using coordinates


I have 2 dataset, one contains information on 18 sites (site_name, lat and long) and the second contains 81 weather stations data (met_name, lat and long). i want to calculate the distance between each sites and the weather stations in other to get the nearest weather stations. However, i want it in a visual form so that i can see the next weather station close to the sites incase the nearest one does not have a complete information that i need.

I have tried this code i found in one of the thread here but it is not working well for me. I will also like to point out that i converted the data from data.frame to vector which was stressful.

library(sp)
library(raster)

CompID<-c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "A10", "A11", "A12", 
      "A13", "A14", "A15", "A16", "A17", "A18")
sitexy<- matrix(c(-28.4,-28.7,-28.6,-28.5,-28.4,-28.7,-28.2,-28.3,-28.9,-28.6,-28.5,-28.5,-28.7,
             -28.7,-28.7,-28.5,-28.5,-28.5,32.2,32.1,32.2,32.2,32.2,32.0,32.3,32.2,31.9,32.2,32.2,
              32.2,32.1,32.1,32.1,32.1,32.1,32.1), ncol=2, byrow = TRUE)
MetID<-c("M1", "M2", "M3", "M4", "M5", "M6", "M7", "M8", "M9", "M10", "M11", "M12", "M13", "M14", 
         "M15","M16", "M17", "M18", "M19", "M20", "M21", "M22", "M23", "M24", "M25", "M26", "M27", 
         "M28", "M29", "M30", "M31","M32", "M33", "M34", "M35", "M36", "M37", "M38", "M39", "M40", 
          "M41", "M42","M43", "M44", "M45", "M46", "M47","M48", "M49", "M50", "M51", "M52", "M53", 
         "M54", "M55", "M56", "M57","M58", "M59", "M60", "M61", "M62", "M63", "M64","M65", "M66", 
         "M67", "M68", "M69", "M70", "M71","M72", "M73", "M74", "M75", "M76", "M77", "M78", "M79", 
         "M80", "M81")

metxy<-matrix(c(-28.74, -28.44, -28.20, -28.72, -28.45, -31.06, -30.75, -30.86, -30.15, -30.40, 
               -29.79,
            -29.94, -29.54, -29.63, -29.65, -29.71, -29.77, -29.61, -29.60, -29.26, -29.22, -30.50,
            -29.08, -29.16, -28.69, -28.58, -29.01, -28.95, -28.85, -28.74, -28.38, -28.36, -28.31,
            28.44, -28.20, -27.78, -27.41, -27.39, -27.57, -27.31, -27.55, -27.13, -27.39, -29.06,
            -29.05, -29.25, -29.04, -29.20, -29.32, -29.26, -29.64, -26.85, -28.58, -28.50, -26.94,
            -27.01, -27.03, -29.94, -29.83, -30.13, -29.57, -29.65, -30.21, -28.60, -28.89, -28.35,
            -28.22, 28.72, 28.90, 28.90, 29.02, 27.42, 28.58, 28.50, 27.63, 28.45, 27.40,28.45,28.60, 
             28.35, 28.22, 32.09, 32.18, 32.41, 31.88, 32.22, 30.22, 30.26, 30.34, 30.07, 
            30.69, 29.35, 29.96, 30.27, 30.40, 30.40, 31.05, 31.06, 31.12, 31.13, 29.52, 30.00, 
            29.39, 30.60, 31.40, 28.95, 29.75, 29.86, 31.71, 31.85, 32.09, 29.39, 31.21, 31.42, 
            32.18, 32.41, 30.80, 31.59, 32.18, 30.82, 30.71, 30.51, 30.91, 30.83, 30.82, 30.89, 
            30.33, 30.52, 30.67, 30.58, 30.50, 30.51, 30.46, 31.33, 31.34, 30.83, 30.79, 30.61, 
            29.91, 30.31, 29.89, 30.27, 29.97, 30.08, 32.18, 31.83, 32.25, 32.36, 31.88, 31.31, 
            31.43, 31.60, 32.20, 31.33, 31.34, 32.02, 32.22, 31.58, 32.28, 32.18, 32.25, 32.36), ncol 
            = 2, byrow = TRUE)


d<-pointDistance(sitexy, metxy, lonlat = TRUE)
r<-apply(d, 1, which.min)
r
p<-data.frame(site=CompID, Met=MetID [r])
p

plot(rbind(metxy, sitexy), ylim=c(0,45), xlab='longitude', ylab='latitude')
points(metxy, col="red", pch=20, cex=2)
points(sitexy, col="blue", pch=20, cex=2)
text(sitexy, CompID, pos=1)
text(metxy, MetID, pos=1)
for(i in 1:nrow(subxy)) {
arrows(subxy[i,1], subxy[i,2], blockxy[r[i], 1], blockxy[r[i],2])}

Here is the output i got


Solution

  • I'm not sure if these two coordinates are correct. They appear as outliers (errors?) on your plot.

    metxy
            [,1]   [,2]
    [15,] -28.85 -28.74
    [16,] -28.38 -28.36
    [17,] -28.31  28.44 # <--- this one
    [18,] -28.20 -27.78
    [19,] -27.41 -27.39
    
    [33,] -28.89 -28.35
    [34,] -28.22  28.72 # <--- and this one.
    [35,]  28.90  28.90
    

    Anyway, it looks like there are two (or 3?) distinct regions, each containing 9 sites. You can visualise this with:

    plot(metxy)
    

    Region 1 contains 34 weather stations and region 2 contains 47. To better visualize the distances, you could either split the data by region, or use xlim and ylim to zoom in to the particular coordinates of interest.

    op <- par(mar=c(0,0,0,0), mfrow=c(1,2))
    
    plot(rbind(metxy, sitexy), xlab='longitude', ylab='latitude', las=1, 
            xlim=c(-27.75, -29.25), ylim=c(-27.75, -29.25))
    
    grid()
    points(metxy, col="red", pch=20)
    points(sitexy, col="blue", pch=20)
    text(sitexy, CompID, pos=1, cex=0.7, col="blue")
    text(metxy, MetID, pos=1, cex=0.6, col="red")
    legend("topleft", legend=c("Weather station", "Research site"), pch=20, col=c("red","blue"), title="Region 1", bty="n")
    
    
    for(i in seq_along(r)) {
      arrows(sitexy[i,1], sitexy[i,2], metxy[r[i], 1], metxy[r[i],2], length=0.1)
    }
    
    plot(rbind(metxy, sitexy), xlab='longitude', ylab='latitude', las=1, 
         xlim=c(31.5, 32.5), ylim=c(31.5, 32.5), yaxt="n")
    #etc....
    ...
    par(op)
    

    enter image description here

    Site A1 and A3 appear to have the same location and this is confirmed by inspecting the data, as do A6 and A9. The overlapping text is a small problem, but I am not familiar with any base functions that can alleviate this like the ggrepel package.