I've a matrix 8x8 matrix with cities and their distances from each other as under:
+--------------+------+--------+------+--------------+---------+------+------+----------+
| | NYC | BOSTON | DC | PHILADELPHIA | CHICAGO | SF | LA | SAN JOSE |
+--------------+------+--------+------+--------------+---------+------+------+----------+
| NYC | 0 | 200 | 300 | 500 | 600 | 1500 | 1800 | 2000 |
| BOSTON | 200 | 0 | 300 | 200 | 700 | 1600 | 1900 | 2100 |
| DC | 300 | 300 | 0 | 250 | 550 | 1400 | 1850 | 2200 |
| PHILADELPHIA | 500 | 200 | 250 | 0 | 650 | 1300 | 1700 | 1900 |
| CHICAGO | 600 | 700 | 550 | 650 | 0 | 1250 | 1600 | 1500 |
| SF | 1500 | 1600 | 1400 | 1300 | 1250 | 0 | 300 | 400 |
| LA | 1800 | 1900 | 1850 | 1700 | 1600 | 300 | 0 | 250 |
| SAN JOSE | 2000 | 2100 | 2200 | 1900 | 1500 | 400 | 250 | 0 |
+--------------+------+--------+------+--------------+---------+------+------+----------+
I'm trying to filter the combinations where the distance is greater than 500 and then concatenate the results as under:
+--------------+---------------------------+---------------+
| FROM | TO | DISTANCE |
+--------------+---------------------------+---------------+
| NYC | BOSTON, DC, PHILADELPHIA | 200, 300, 500 |
| BOSTON | NYC,DC, PHILADELPHIA | 200, 300, 200 |
| DC | NYC, BOSTON, PHILADELPHIA | 300,300, 250 |
| PHILADELPHIA | NYC,BOSTON, DC | 500, 200, 250 |
| CHICAGO | | |
| SF | LA, SAN JOSE | 300, 400 |
| LA | SF, SAN JOSE | 300, 250 |
| SAN JOSE | SF, LA | 400, 250 |
+--------------+---------------------------+---------------+
I found a similar example here:
https://stackoverflow.com/questions/20210787/r-getting-the-minimum-value-for-each-row-in-a-matrix-and-returning-the-row-and/20214579#20214579
and I know I can concatenate using the aggregate function
I came up with a usable solution but I was wondering if theres an easy way to achieve this
Below is my solution:
result <- t(sapply(seq(nrow(X)), function(i) {
j <- which.min(X[i,])
c(paste(rownames(X)[i], colnames(X)[j], sep='/////'), X[i,j])
}))
a<-data.frame(do.call('rbind', strsplit(as.character(result$col1),'/////',fixed=TRUE)), result$col2)
Here is the same with an other base R solution:
res <- apply(df, 1, function(x) {
data.frame(
from = names(df)[x == 0],
to = paste0(names(df)[x <= 500 & x > 0], collapse = ", "),
dist = paste0(x[x <= 500 & x > 0], collapse = ", ")
)
})
do.call(rbind, res)
Which results in
# from to dist
# NYC NYC BOSTON, DC, PHILADELPHIA 200, 300, 500
# BOSTON BOSTON NYC, DC, PHILADELPHIA 200, 300, 200
# DC DC NYC, BOSTON, PHILADELPHIA 300, 300, 250
# PHILADELPHIA PHILADELPHIA NYC, BOSTON, DC 500, 200, 250
# CHICAGO CHICAGO
# SF SF LA, SANJOSE 300, 400
# LA LA SF, SANJOSE 300, 250
# SANJOSE SANJOSE SF, LA 400, 250