Search code examples
rspatial

Make a 'fake' set of lat/long coordinates based on distances


I am hoping to build a matrix of 'fake' coordinates (i.e., they could be anywhere on Earth) based on their distances (in km or m) from one another.

I have a data frame containing distances between locations.

dist_df<- data.frame(site1=c("a","b","c","d"),
                     site2=c("b","c","d","a"),
                     distance = c(222.1, 672.4, 45.2, 65.4))

However, the actual coordinates have obstructions between them. Thus, I have circumvented the obstructions with another bit of code to obtain a least-cost distance. To run a series of later functions I require lat/long. I figured the easiest way to do this was to generate 'fake' coordinates based on their relative least-cost distances.

I am hoping to obtain something like the following:

coordinates_sites<-data.frame(site=c("a","b","c","d"), 
                              lat=c(34.5332,32.1232,30.4232,30.4232),
                              long=c(-120.2222,-125.4422,-123.3512,-122.4232))

Any thoughts on the best way to do this? Many thanks in advance!


Solution

  • Here is a solution using cmdscale. As cmdscale needs a complete distance matrix, I had to fill in the missing distances in the dataframe you provided:

    dist_df<- data.frame(site1=c("a","b","c","d", "d", "c"),
                         site2=c("b","c","d","a", "b", "a"),
                         distance = c(222.1, 672.4, 45.2, 65.4, 10.3, 110.1))
    
    dst <- dist_df  %>%
      {
        names <- union(.$site1, .$site2)
        m <- matrix(0,
                    nrow = length(names),
                    ncol = length(names),
                    dimnames = list(names, names))
        m[cbind(.$site1, .$site2)] <- pull(., distance)
        m[cbind(.$site2, .$site1)] <- pull(., distance)
        diag(m) <- 0
        as.dist(m)
      } 
    
    dst
    
    ##>       a     b     c
    ##> b 222.1            
    ##> c 110.1 672.4      
    ##> d  65.4  10.3  45.2
    

    The fake coordinates can be obtained by the following:

    cmdscale(dst)
    
    ##>        [,1]        [,2]
    ##>a  -19.88532  39.4181776
    ##>b  341.11969  -0.9109572
    ##>c -331.30946  -4.3427637
    ##>d   10.07509 -34.1644567
    
    

    Note that the solution returned by cmdscale is an optimization and does not preserve faithfully input distances:

    dist(cmdscale(dst))
    
    ##>           a         b         c
    ##> b 363.25068                    
    ##> c 314.48372 672.43790          
    ##> d  79.44829 332.71057 342.68461
    

    However, if you start from a matrix of actual distances, cmdscale can reconstruct accurately the coordinates. It is shown in the cmdscale example:

    loc <- cmdscale(eurodist)
    x <- loc[, 1]
    y <- -loc[, 2] # reflect so North is at the top
    ## note asp = 1, to ensure Euclidean distances are represented correctly
    plot(x, y, type = "n", xlab = "", ylab = "", asp = 1, axes = FALSE,
         main = "cmdscale(eurodist)")
    text(x, y, rownames(loc), cex = 0.6)
    
    

    enter image description here