
Seeking R function to melt 5-dimensional array, like pivot_longer


I have a program that uses reshape2's melt function to melt a 5-dimensional array with named and labelled dimensions to a long-form data frame, which by definition has only two dimensions. Each dimension of the input array corresponds to a column in the output data frame, and there is one more column that holds the values that were stored in the 5D array.

I understand that reshape2 is deprecated and will soon break, so I am switching to tidyr. However, tidyr's pivot_longer function, which replaces melt, only accepts 2D data frames as input.

Is there a non-deprecated function, in tidyr or elsewhere, that will melt an array with 3 or more named and labelled dimensions to a long-form data frame?

I could write my own function to do it easily enough. But I'd rather use an existing function if there is one.

Thank you

Here's an example of a 2x3x4 array:

df <- expand.grid(w = 1:2,
                  x = 1:3,
                  y = 1:4)
df$z <- runif(nrow(df))

tmp <- tapply(df$z, list(df$w, df$x, df$y), sum)
tmp
, , 1

           1          2         3
1 0.40276418 0.13111652 0.4473557
2 0.08945365 0.03139184 0.1556355

, , 2

          1          2         3
1 0.1413763 0.02106974 0.1103559
2 0.7302435 0.46302772 0.7924580

, , 3

          1         2         3
1 0.2793435 0.4244807 0.7955351
2 0.9828739 0.7740189 0.6436733

, , 4

          1          2         3
1 0.9852345 0.20508490 0.8744829
2 0.2812744 0.06272449 0.0936831

Solution

  • Sticking with base R, you can wrap your array in ftable before using as.data.frame:

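    set.seed(1)
    a <- array(sample(100, 2*3*4, TRUE), dim = c(2, 3, 4))
    # The values printed below come from R < 3.6.0; newer versions draw different numbers.
    b <- provideDimnames(a)  # adds default labels A, B, C, ... to each dimension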
    b
    # , , A
    # 
    #    A  B  C
    # A 27 58 21
    # B 38 91 90
    # 
    # , , B
    # 
    #    A  B  C
    # A 95 63 21
    # B 67  7 18
    # 
    # , , C
    # 
    #    A  B   C
    # A 69 77  72
    # B 39 50 100
    # 
    # , , D
    # 
    #    A  B  C
    # A 39 94 66
    # B 78 22 13
    
    as.data.frame(ftable(b))
    #    Var1 Var2 Var3 Freq
    # 1     A    A    A   27
    # 2     B    A    A   38
    # 3     A    B    A   58
    # 4     B    B    A   91
    # 5     A    C    A   21
    # 6     B    C    A   90
    # 7     A    A    B   95
    # 8     B    A    B   67
    # 9     A    B    B   63
    # 10    B    B    B    7
    # 11    A    C    B   21
    # 12    B    C    B   18
    # 13    A    A    C   69
    # 14    B    A    C   39
    # 15    A    B    C   77
    # 16    B    B    C   50
    # 17    A    C    C   72
    # 18    B    C    C  100
    # 19    A    A    D   39
    # 20    B    A    D   78
    # 21    A    B    D   94
    # 22    B    B    D   22
    # 23    A    C    D   66
    # 24    B    C    D   13
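
    The generic Var1, Var2, Var3 and Freq column names above can be avoided. Here is a minimal sketch, assuming the question's array is rebuilt with named grouping factors so that its dimensions carry names. The names w, x, y and z come from the question's example, and the seed is arbitrary. It calls base R's as.data.frame.table directly and uses its responseName argument to name the value column:

    ```r
    # Rebuild the 2x3x4 example from the question, naming the grouping
    # factors so that the resulting array has named dimensions.
    set.seed(42)  # arbitrary seed, only for reproducibility
    df <- expand.grid(w = 1:2, x = 1:3, y = 1:4)
    df$z <- runif(nrow(df))
    arr <- tapply(df$z, list(w = df$w, x = df$x, y = df$y), sum)

    # One column per dimension, plus a value column named by responseName.
    long <- as.data.frame.table(arr, responseName = "z",
                                stringsAsFactors = FALSE)

    # Dimension labels come back as character; convert them if numbers are needed.
    long[c("w", "x", "y")] <- lapply(long[c("w", "x", "y")], as.integer)

    str(long)
    ```

    The same call should work unchanged for a 5-dimensional array, since the number of dimension columns follows the array's dimnames.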
    

    You can also use as.data.table from the "data.table" package, which has a method for arrays. The following should work:

    library(data.table)
    as.data.table(b)
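
    For completeness, the hand-written route mentioned in the question is also short in base R. The following is only a sketch: melt_array and its value_name argument are hypothetical names, not part of any package. It builds the label columns with expand.grid, whose first factor varies fastest, which matches the column-major order in which R stores array values:

    ```r
    # melt_array() is a hypothetical helper, not a function from any package.
    melt_array <- function(a, value_name = "value") {
      a <- provideDimnames(a)            # fill in labels for unlabelled dimensions
      dn <- dimnames(a)
      # Give unnamed dimensions fallback column names Var1, Var2, ...
      if (is.null(names(dn))) names(dn) <- rep("", length(dn))
      blank <- names(dn) == ""
      names(dn)[blank] <- paste0("Var", seq_along(dn))[blank]
      # One row per cell; the first dimension varies fastest, as in the array.
      out <- expand.grid(dn, KEEP.OUT.ATTRS = FALSE, stringsAsFactors = FALSE)
      out[[value_name]] <- as.vector(a)
      out
    }

    # Usage on a 5-dimensional array with no dimnames at all.
    a5 <- array(seq_len(2 * 3 * 4 * 5 * 6), dim = c(2, 3, 4, 5, 6))
    m5 <- melt_array(a5)
    dim(m5)   # 720 rows, 6 columns (5 label columns plus the values)
    ```

    Unlike the as.data.frame route, this keeps the labels as character columns rather than factors; which is preferable depends on what the downstream code expects.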