
ggplot for objects stored in a 3D array


I have written the following code, which works and plots what I want. How could I produce the same plot with ggplot2?

myseq<-seq(from = 1, to = 99, by = 5)
mtx <- array(rnorm(880,0,1) ,c(4,11,length(myseq)))

plot(myseq,mtx[1,1,],type = "l", col=1) 
lines(myseq,mtx[1,2,],type = "l",col=2)
lines(myseq,mtx[1,3,],type = "l",col=3)
lines(myseq,mtx[1,4,],type = "l",col=4)
lines(myseq,mtx[1,5,],type = "l",col=5)
lines(myseq,mtx[1,6,],type = "l",col=6)
lines(myseq,mtx[1,7,],type = "l",col=7)
lines(myseq,mtx[1,8,],type = "l",col=8)
lines(myseq,mtx[1,9,],type = "l",col=9)
lines(myseq,mtx[1,10,],type = "l",col=10)
lines(myseq,mtx[1,11,],type = "l",col=11)

Running this produces a plot of 11 colored lines. How can I draw the same thing with ggplot2?


Solution

  • ggplot2 works on data frames, and for a plot like this it wants them in long format: one row per plotted point.

    Here's a basic way to do it.

    First, reproducible random data.

    set.seed(42)
    myseq <- seq(from = 1, to = 99, by = 5)
    mtx <- array(rnorm(880, 0, 1), c(4, 11, length(myseq)))
    mtx[1,1:4,1:4]
    #            [,1]       [,2]       [,3]          [,4]
    # [1,]  1.3709584 -1.3682810  0.9333463  6.288407e-05
    # [2,]  0.4042683 -0.4314462  0.6503486 -1.173196e-01
    # [3,]  2.0184237  1.5757275 -1.1317387 -8.610730e-02
    # [4,] -1.3888607  0.6792888  1.2009654 -4.138688e-01
    

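    From here, we can convert the 3D array into a four-column data.frame: three columns hold the index along each dimension, and the fourth holds the value. Using drop=FALSE keeps all three dimensions even though only the first plane is selected, so the columns stay Var1, Var2, and Var3.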

    melted_mtx <- reshape2::melt(mtx[1,,,drop=FALSE])
    str(melted_mtx, vec.len=15)
    # 'data.frame': 220 obs. of  4 variables:
    #  $ Var1 : int  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
    #  $ Var2 : int  1 2 3 4 5 6 7 8 9 10 11 1 2 3 4 5 6 7 8 9 10 11 1 2 3 4 5 6 7 8 9 10 11 1 2 3 4 5 ...
    #  $ Var3 : int  1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 ...
    #  $ value: num  1.371 0.404 2.018 -1.389 -0.284 -0.307 1.895 0.46 1.035 -0.784 0.206 -1.368 -0.431 1.576 0.679 -0.367 -0.727 0.921 0.624 ...
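
    If reshape2 is not available, a similar long layout can be built in base R. This is only a sketch of an alternative, not part of the approach above: the column names series, x, and value are my own choices, and it handles a single plane rather than the whole array.

    ```r
    set.seed(42)
    myseq <- seq(from = 1, to = 99, by = 5)
    mtx <- array(rnorm(880, 0, 1), c(4, 11, length(myseq)))

    # First plane as an 11 x 20 matrix: rows are the series, columns follow myseq.
    plane <- mtx[1, , ]

    # as.vector() unrolls a matrix column by column, so row() and col()
    # supply the matching row and column index for every value.
    long <- data.frame(
      series = as.vector(row(plane)),
      x      = myseq[as.vector(col(plane))],
      value  = as.vector(plane)
    )
    str(long)
    ```

    Here the x column already holds the myseq values, so no separate replacement step is needed for this variant.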
    

    Since you want myseq on the x-axis, we can replace the indices of the third dimension, Var3, with the corresponding myseq values. I'll first check that Var3 has as many distinct values as myseq has elements (20), then do the replacement:

    lapply(melted_mtx[-4], table)
    # $Var1
    #   1 
    # 220 
    # $Var2
    #  1  2  3  4  5  6  7  8  9 10 11 
    # 20 20 20 20 20 20 20 20 20 20 20 
    # $Var3
    #  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 
    # 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 
    
    melted_mtx$Var3 <- myseq[melted_mtx$Var3]
    lapply(melted_mtx[-4], table)
    # $Var1
    #   1 
    # 220 
    # $Var2
    #  1  2  3  4  5  6  7  8  9 10 11 
    # 20 20 20 20 20 20 20 20 20 20 20 
    # $Var3
    #  1  6 11 16 21 26 31 36 41 46 51 56 61 66 71 76 81 86 91 96 
    # 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 
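
    The replacement works because indexing a vector by a vector of positions returns one element per position, repeats included. A small illustration, where idx is a made-up stand-in for melted_mtx$Var3:

    ```r
    myseq <- seq(from = 1, to = 99, by = 5)

    # Hypothetical index vector standing in for melted_mtx$Var3.
    idx <- c(1, 1, 2, 20)

    # Each index is swapped for the element of myseq at that position.
    myseq[idx]
    # [1]  1  1  6 96

    # A cheap guard before replacing: every index must be a valid position.
    stopifnot(all(idx %in% seq_along(myseq)))
    ```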
    

    We can use that and feed it to ggplot2 rather directly.

    library(ggplot2)
    ggplot(melted_mtx, aes(Var3, value, group = Var2, color = factor(Var2))) +
      geom_line()
    

    [Figure: ggplot2 plot of plane 1]

    The base graphics plot of that (your code) renders:

    [Figure: the same data in base graphics]

    which, margins and colors aside, is effectively the same plot. There is plenty of beautification that can be done, including legend labels, axes, etc.
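
    As one possible starting point for that beautification, here is a sketch that sets the axis and legend titles and switches the theme. The label text ("myseq", "Column") and the choice of theme_minimal() are my own assumptions, not something the question asks for.

    ```r
    library(ggplot2)

    set.seed(42)
    myseq <- seq(from = 1, to = 99, by = 5)
    mtx <- array(rnorm(880, 0, 1), c(4, 11, length(myseq)))
    melted_mtx <- reshape2::melt(mtx[1, , , drop = FALSE])
    melted_mtx$Var3 <- myseq[melted_mtx$Var3]

    p <- ggplot(melted_mtx, aes(Var3, value, group = Var2, color = factor(Var2))) +
      geom_line() +
      labs(x = "myseq", y = "value", color = "Column") +  # axis and legend titles
      theme_minimal()                                      # lighter background
    p
    ```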

    Above, I melted only the first plane of the array. If you melt the whole array, it looks like this:

    melted_mtx <- reshape2::melt(mtx)
    melted_mtx$Var3 <- myseq[melted_mtx$Var3]
    ggplot(melted_mtx, aes(Var3, value, group = interaction(Var1, Var2),
                           color = interaction(Var1, Var2))) +
      geom_line()
    

    [Figure: ggplot2 plot with all planes]

    The plot is busy, clearly, but interaction() lets you group by more than one variable. Grouping is required here: without it, ggplot2 tries to connect all of the points in a single line, which is rarely useful. A discrete color= mapping is usually enough to imply the groups, but I often set both group= and color= so that the grouping does not change by accident if I later redefine how colors, linetypes, shapes, and so on are mapped.
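
    To see the effect of group= in isolation, here is a small sketch on made-up data (toy and its columns are hypothetical). It uses layer_data() to inspect the group ggplot2 assigns to each row, rather than relying on the rendered picture.

    ```r
    library(ggplot2)

    # Toy long-format data (made up for illustration): two series, three points each.
    toy <- data.frame(
      x     = rep(1:3, times = 2),
      id    = rep(1:2, each = 3),
      value = c(1, 2, 3, 3, 2, 1)
    )

    # Numeric id mapped to color only: the mapping is continuous, no grouping is
    # inferred, and geom_line() treats all six points as one path.
    p_one <- ggplot(toy, aes(x, value, color = id)) + geom_line()

    # Explicit group: one line per id, whatever the color mapping is.
    p_grouped <- ggplot(toy, aes(x, value, group = id, color = id)) + geom_line()

    # Count the distinct groups ggplot2 computed for each plot.
    length(unique(layer_data(p_one)$group))
    length(unique(layer_data(p_grouped)$group))
    ```

    The first plot should report a single group and the second one group per id, which is why an explicit group= is the safer habit when the color variable might be numeric.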