Search code examples
rmatrixdplyrplyr

Manipulating a matrix with conditions and concatenating the results


I've a matrix 8x8 matrix with cities and their distances from each other as under:

+--------------+------+--------+------+--------------+---------+------+------+----------+
|              | NYC  | BOSTON |  DC  | PHILADELPHIA | CHICAGO |  SF  |  LA  | SAN JOSE |
+--------------+------+--------+------+--------------+---------+------+------+----------+
| NYC          |    0 |    200 |  300 |          500 |     600 | 1500 | 1800 |     2000 |
| BOSTON       |  200 |      0 |  300 |          200 |     700 | 1600 | 1900 |     2100 |
| DC           |  300 |    300 |    0 |          250 |     550 | 1400 | 1850 |     2200 |
| PHILADELPHIA |  500 |    200 |  250 |            0 |     650 | 1300 | 1700 |     1900 |
| CHICAGO      |  600 |    700 |  550 |          650 |       0 | 1250 | 1600 |     1500 |
| SF           | 1500 |   1600 | 1400 |         1300 |    1250 |    0 |  300 |      400 |
| LA           | 1800 |   1900 | 1850 |         1700 |    1600 |  300 |    0 |      250 |
| SAN JOSE     | 2000 |   2100 | 2200 |         1900 |    1500 |  400 |  250 |        0 |
+--------------+------+--------+------+--------------+---------+------+------+----------+

I'm trying to filter the combinations where the distance is greater than 500 and then concatenate the results as under:

+--------------+---------------------------+---------------+
|     FROM     |            TO             |   DISTANCE    |
+--------------+---------------------------+---------------+
| NYC          | BOSTON, DC, PHILADELPHIA  | 200, 300, 500 |
| BOSTON       | NYC,DC, PHILADELPHIA      | 200, 300, 200 |
| DC           | NYC, BOSTON, PHILADELPHIA |  300,300, 250 |
| PHILADELPHIA | NYC,BOSTON, DC            | 500, 200, 250 |
| CHICAGO      |                           |               |
| SF           | LA, SAN JOSE              |      300, 400 |
| LA           | SF, SAN JOSE              |      300, 250 |
| SAN JOSE     | SF, LA                    |      400, 250 |
+--------------+---------------------------+---------------+

I found a similar example here:

https://stackoverflow.com/questions/20210787/r-getting-the-minimum-value-for-each-row-in-a-matrix-and-returning-the-row-and/20214579#20214579

and I know I can concatenate using the aggregate function

I came up with a usable solution but I was wondering if theres an easy way to achieve this

Below is my solution:

result <- t(sapply(seq(nrow(X)), function(i) {
  j <- which.min(X[i,])
  c(paste(rownames(X)[i], colnames(X)[j], sep='/////'), X[i,j])
}))

a<-data.frame(do.call('rbind', strsplit(as.character(result$col1),'/////',fixed=TRUE)), result$col2)

Solution

  • Here is the same with an other base R solution:

    res <- apply(df, 1, function(x) {
      data.frame(
        from = names(df)[x == 0], 
        to = paste0(names(df)[x <= 500 & x > 0], collapse = ", "),
        dist = paste0(x[x <= 500 & x > 0], collapse = ", ")
      )
    })
    
    do.call(rbind, res)
    

    Which results in

    #                     from                        to          dist
    # NYC                   NYC  BOSTON, DC, PHILADELPHIA 200, 300, 500
    # BOSTON             BOSTON     NYC, DC, PHILADELPHIA 200, 300, 200
    # DC                     DC NYC, BOSTON, PHILADELPHIA 300, 300, 250
    # PHILADELPHIA PHILADELPHIA           NYC, BOSTON, DC 500, 200, 250
    # CHICAGO           CHICAGO                                        
    # SF                     SF               LA, SANJOSE      300, 400
    # LA                     LA               SF, SANJOSE      300, 250
    # SANJOSE           SANJOSE                    SF, LA      400, 250