Creating an edgelist with multiple columns and N/As in R


I am working with igraph in R.

I have a data frame (df) with 29 columns. Some rows have a value in every column and others contain NAs.

It looks a little something like this:

      V1 V2 V3 V4
   1   1  2  3  NA
   2   2  3  NA NA
   3   2  4  1  NA
   4   1 NA  NA NA

but much larger. I am having trouble creating an edgelist from this data and have tried:

myPairs <- apply(t(df), 2, function(x) t(combn(x[!is.na(x)], 2)))

but keep getting this error:

Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 't': n < m

The output should look like this:

      col1   col2
   1  1      2
   2  1      3
   3  2      3
   4  2      3
   5  2      4
   6  2      1
   7  1      4

Any help would be very much appreciated!


Solution

  • Here is one approach.

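    First, make sure your data is numeric. Note that sapply returns a numeric matrix here rather than a data.frame, which works fine with apply: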

    df <- sapply(df, as.numeric)
    

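    The error occurs because combn(x, 2) fails with "n < m" when a row has fewer than two non-missing values (row 4 in your example). You can use apply with combn as you have done, but first use na.omit to remove missing values, and check the length so that rows with only one value are skipped.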

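    # For each row, drop NAs and return all pairs of the remaining values
    do.call(rbind, apply(df, 1, function(x) {
      y <- na.omit(x)
      # Rows with fewer than two values yield NULL, which rbind ignores
      if (length(y) > 1)
        t(combn(y, 2))
    }))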
    

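    Output (the last pair is 4 1 rather than 1 4 because combn preserves the order of values within the row; for an undirected graph this is the same edge)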

         [,1] [,2]
    [1,]    1    2
    [2,]    1    3
    [3,]    2    3
    [4,]    2    3
    [5,]    2    4
    [6,]    2    1
    [7,]    4    1
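
For anyone who wants to reproduce this end to end, here is a minimal sketch in base R. It rebuilds the four-row sample from the question as a numeric matrix (the exact layout is an assumption), captures the "n < m" error that row 4 triggers, and then builds the edge list. The helper name row_pairs is hypothetical, and lapply over row indices is swapped in for apply because apply can simplify its result to a matrix when every row produces output of the same length, which do.call(rbind, ...) would not handle.

```r
# Sample data from the question, rebuilt as a numeric matrix (assumed layout).
df <- matrix(c(1,  2,  3, NA,
               2,  3, NA, NA,
               2,  4,  1, NA,
               1, NA, NA, NA),
             ncol = 4, byrow = TRUE,
             dimnames = list(NULL, paste0("V", 1:4)))

# Why the original attempt fails: row 4 has a single non-NA value,
# so combn() is asked to choose 2 items from 1.
err <- tryCatch(
  combn(df[4, !is.na(df[4, ])], 2),
  error = function(e) conditionMessage(e)
)

# Hypothetical helper: all pairs of non-NA values in one row,
# or NULL when the row has fewer than two values.
row_pairs <- function(x) {
  y <- x[!is.na(x)]
  if (length(y) < 2) return(NULL)
  t(combn(y, 2))
}

# lapply always returns a list, so rbind sees one matrix (or NULL) per row.
edges <- do.call(rbind, lapply(seq_len(nrow(df)), function(i) row_pairs(df[i, ])))
colnames(edges) <- c("col1", "col2")
edges
```

The resulting two-column matrix should be usable as input to igraph::graph_from_edgelist(edges, directed = FALSE), assuming the values are meant as vertex ids.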