Make a 'fake' set of lat/long coordinates based on distances

I am hoping to build a matrix of 'fake' coordinates (i.e., they could be anywhere on Earth) based on their distances (in km or m) from one another.

I have a data frame containing distances between locations.

dist_df<- data.frame(site1=c("a","b","c","d"),
                     site2=c("b","c","d","a"),
                     distance = c(222.1, 672.4, 45.2, 65.4))

However, the actual coordinates have obstructions between them. Thus, I have circumvented the obstructions with another bit of code to obtain a least-cost distance. To run a series of later functions I require lat/long. I figured the easiest way to do this was to generate 'fake' coordinates based on their relative least-cost distances.

I am hoping to obtain something like the following:

coordinates_sites<-data.frame(site=c("a","b","c","d"), 
                              lat=c(34.5332,32.1232,30.4232,30.4232),
                              long=c(-120.2222,-125.4422,-123.3512,-122.4232))

Any thoughts on the best way to do this? Many thanks in advance!

Solution

Here is a solution using cmdscale. As cmdscale needs a complete distance matrix, I had to fill in the missing distances in the dataframe you provided:

dist_df<- data.frame(site1=c("a","b","c","d", "d", "c"),
                     site2=c("b","c","d","a", "b", "a"),
                     distance = c(222.1, 672.4, 45.2, 65.4, 10.3, 110.1))

dst <- dist_df  %>%
  {
    names <- union(.$site1, .$site2)
    m <- matrix(0,
                nrow = length(names),
                ncol = length(names),
                dimnames = list(names, names))
    m[cbind(.$site1, .$site2)] <- pull(., distance)
    m[cbind(.$site2, .$site1)] <- pull(., distance)
    diag(m) <- 0
    as.dist(m)
  } 

dst

##>       a     b     c
##> b 222.1            
##> c 110.1 672.4      
##> d  65.4  10.3  45.2

The fake coordinates can be obtained by the following:

cmdscale(dst)

##>        [,1]        [,2]
##>a  -19.88532  39.4181776
##>b  341.11969  -0.9109572
##>c -331.30946  -4.3427637
##>d   10.07509 -34.1644567

Note that the solution returned by cmdscale is an optimization and does not preserve faithfully input distances:

dist(cmdscale(dst))

##>           a         b         c
##> b 363.25068                    
##> c 314.48372 672.43790          
##> d  79.44829 332.71057 342.68461

However, if you start from a matrix of actual distances, cmdscale can reconstruct accurately the coordinates. It is shown in the cmdscale example:

loc <- cmdscale(eurodist)
x <- loc[, 1]
y <- -loc[, 2] # reflect so North is at the top
## note asp = 1, to ensure Euclidean distances are represented correctly
plot(x, y, type = "n", xlab = "", ylab = "", asp = 1, axes = FALSE,
     main = "cmdscale(eurodist)")
text(x, y, rownames(loc), cex = 0.6)