Search code examples
rdataframevector

How to create a list of all possible vectors that are formed from a dataframe in R?


I have a data frame called indx like this;

C1 C2
1 5
6 1
3 7
3 1
5 3
7 6

And I want to create a list of all possible vectors that are formed according to the rule that, if the 2nd column of a row is equal to the 1st column of another row, then put them in order without repeat. For example, 5 is found both at [1,2] and [5,1] positions. Then one of the vectors should be (1, 5, 3)

So the desired output is like this;

[[1]]
[1] 1 5 3

[[2]]
[1] 6 1 5

[[3]]
[1] 3 7 6

[[4]]
[1] 3 1 5

[[5]]
[1] 5 3 7

[[6]]
[1] 5 3 1

[[7]]
[1] 7 6 1

[[8]]
[1] 1 5 3 7

[[9]]
[1] 1 5 3 7 6

[[10]]
[1] 6 1 5 3

[[11]]
[1] 6 1 5 3 7

[[12]]
[1] 3 7 6 1

I have wrote the code below, but it is very far from the desired result.

sol1 <- c()

for(i in seq_along(indx[,1])){
  for(j in seq_along(indx[,1])){
    for(k in seq_along(indx[,1])){
      if(indx[i,1] == indx[j,2]){
        sol1[k] <- indx[i,1];
        sol1[k+1] <- indx[i,2]
      }
    }
  }
}

I know it is a confusing task. Thank you in advance


Solution

  • You can reframe your problem by thinking of the numbers as nodes in a graph. Your data frame can be thought of the edgelist of a directed graph, and the result you are looking for is all simple paths (i.e. paths that don't repeat nodes) through the graph. You can easily do this using igraph:

    library(igraph)
    
    g <- graph_from_data_frame(indx)
    

    The graph looks like this:

    plot(g)
    

    You can get all the simple paths as follows (removing the length-2 paths that are equivalent to the individual rows in your data frame, since you seem not to want these in the output)

    res <- lapply(unique(as.character(indx[[1]])), function(x) {
      p <- all_simple_paths(g, x) |> 
        lapply(\(y) names(y) |> as.numeric())
      p[lengths(p) > 2]
      })
    
    res <- unique(do.call("c", res))
    

    The result looks like this (note that there are a few you have missed in your example output):

    res
    #> [[1]]
    #> [1] 1 5 3
    #> 
    #> [[2]]
    #> [1] 1 5 3 7
    #> 
    #> [[3]]
    #> [1] 1 5 3 7 6
    #> 
    #> [[4]]
    #> [1] 6 1 5
    #> 
    #> [[5]]
    #> [1] 6 1 5 3
    #> 
    #> [[6]]
    #> [1] 6 1 5 3 7
    #> 
    #> [[7]]
    #> [1] 3 1 5
    #> 
    #> [[8]]
    #> [1] 3 7 6
    #> 
    #> [[9]]
    #> [1] 3 7 6 1
    #> 
    #> [[10]]
    #> [1] 3 7 6 1 5
    #> 
    #> [[11]]
    #> [1] 5 3 1
    #> 
    #> [[12]]
    #> [1] 5 3 7
    #> 
    #> [[13]]
    #> [1] 5 3 7 6
    #> 
    #> [[14]]
    #> [1] 5 3 7 6 1
    #> 
    #> [[15]]
    #> [1] 7 6 1
    #> 
    #> [[16]]
    #> [1] 7 6 1 5
    #> 
    #> [[17]]
    #> [1] 7 6 1 5 3
    

    Created on 2023-10-09 with reprex v2.0.2