I have written the following code, which works and plots what I want. How could I produce the same plot with ggplot2?
myseq<-seq(from = 1, to = 99, by = 5)
mtx <- array(rnorm(880,0,1) ,c(4,11,length(myseq)))
plot(myseq,mtx[1,1,],type = "l", col=1)
lines(myseq,mtx[1,2,],type = "l",col=2)
lines(myseq,mtx[1,3,],type = "l",col=3)
lines(myseq,mtx[1,4,],type = "l",col=4)
lines(myseq,mtx[1,5,],type = "l",col=5)
lines(myseq,mtx[1,6,],type = "l",col=6)
lines(myseq,mtx[1,7,],type = "l",col=7)
lines(myseq,mtx[1,8,],type = "l",col=8)
lines(myseq,mtx[1,9,],type = "l",col=9)
lines(myseq,mtx[1,10,],type = "l",col=10)
lines(myseq,mtx[1,11,],type = "l",col=11)
After running this, I get a plot like the one below. How can I do the same thing using ggplot2?
ggplot2 prefers data frames, and for something like this, data frames in a long format.
Here's a basic way to do it.
First, reproducible random data.
set.seed(42)
myseq<-seq(from = 1, to = 99, by = 5)
mtx <- array(rnorm(880,0,1) ,c(4,11,length(myseq)))
mtx[1,1:4,1:4]
# [,1] [,2] [,3] [,4]
# [1,] 1.3709584 -1.3682810 0.9333463 6.288407e-05
# [2,] 0.4042683 -0.4314462 0.6503486 -1.173196e-01
# [3,] 2.0184237 1.5757275 -1.1317387 -8.610730e-02
# [4,] -1.3888607 0.6792888 1.2009654 -4.138688e-01
From here, we can convert this 3D array into a four-column data.frame, where three of the columns indicate the axes and the fourth is the actual value.
melted_mtx <- reshape2::melt(mtx[1,,,drop=FALSE])
str(melted_mtx, vec.len=15)
# 'data.frame': 220 obs. of 4 variables:
# $ Var1 : int 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
# $ Var2 : int 1 2 3 4 5 6 7 8 9 10 11 1 2 3 4 5 6 7 8 9 10 11 1 2 3 4 5 6 7 8 9 10 11 1 2 3 4 5 ...
# $ Var3 : int 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 ...
# $ value: num 1.371 0.404 2.018 -1.389 -0.284 -0.307 1.895 0.46 1.035 -0.784 0.206 -1.368 -0.431 1.576 0.679 -0.367 -0.727 0.921 0.624 ...
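If you would rather not depend on reshape2, the same long layout can be sketched in base R. This is only a rough equivalent under my own naming: long_mtx, plane, and idx are illustrative names, and the column names are chosen to mirror what melt produced above. It relies on expand.grid varying its first variable fastest, which matches the column-major order in which R stores arrays.

```r
set.seed(42)
myseq <- seq(from = 1, to = 99, by = 5)
mtx <- array(rnorm(880, 0, 1), c(4, 11, length(myseq)))

# Keep all three dimensions so the index columns line up with the array axes.
plane <- mtx[1, , , drop = FALSE]

# One row per array cell; the first variable varies fastest,
# which is the same order as as.vector() on an array.
idx <- expand.grid(
  Var1 = seq_len(dim(plane)[1]),
  Var2 = seq_len(dim(plane)[2]),
  Var3 = seq_len(dim(plane)[3])
)

# Attach the values (hypothetical name: long_mtx).
long_mtx <- cbind(idx, value = as.vector(plane))
str(long_mtx)
```

The rest of the steps below should work on long_mtx just as they do on melted_mtx.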
Since you want to plot myseq on the x-axis, we can replace the index with those values. The index is in the third variable here, Var3. I'll verify that it is the correct length and then do the replacement:
lapply(melted_mtx[-4], table)
# $Var1
# 1
# 220
# $Var2
# 1 2 3 4 5 6 7 8 9 10 11
# 20 20 20 20 20 20 20 20 20 20 20
# $Var3
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
# 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11
melted_mtx$Var3 <- myseq[melted_mtx$Var3]
lapply(melted_mtx[-4], table)
# $Var1
# 1
# 220
# $Var2
# 1 2 3 4 5 6 7 8 9 10 11
# 20 20 20 20 20 20 20 20 20 20 20
# $Var3
# 1 6 11 16 21 26 31 36 41 46 51 56 61 66 71 76 81 86 91 96
# 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11
We can use that and feed it to ggplot2 rather directly.
library(ggplot2)
ggplot(melted_mtx, aes(Var3, value, group = Var2, color = factor(Var2))) +
geom_line()
The base graphics plot of that (your code) renders:
which, margins and colors aside, is effectively the same plot. There is plenty of beautification that can be done, including legend labels, axes, etc.
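As one example of that beautification, here is a minimal sketch that relabels the axes and the legend. The label text ("myseq", "Column") and the choice of theme_minimal() are just my assumptions about what you might want; swap in whatever suits your data.

```r
library(ggplot2)

set.seed(42)
myseq <- seq(from = 1, to = 99, by = 5)
mtx <- array(rnorm(880, 0, 1), c(4, 11, length(myseq)))

# Same preparation as above: first plane only, index replaced by myseq.
melted_mtx <- reshape2::melt(mtx[1, , , drop = FALSE])
melted_mtx$Var3 <- myseq[melted_mtx$Var3]

p <- ggplot(melted_mtx, aes(Var3, value, group = Var2, color = factor(Var2))) +
  geom_line() +
  # Illustrative labels; the legend title is set through the color aesthetic.
  labs(x = "myseq", y = "value", color = "Column") +
  theme_minimal()

print(p)
```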
Above, I only melted the first plane of the array. If you did the whole array, then it'd look like this:
melted_mtx <- reshape2::melt(mtx)
melted_mtx$Var3 <- myseq[melted_mtx$Var3]
ggplot(melted_mtx, aes(Var3, value, group = interaction(Var1, Var2),
color = interaction(Var1, Var2))) +
geom_line()
This is a bit complex, clearly, but with interaction you can group by multiple variables. Grouping is required here, since without it ggplot2 will try to connect all of the points in a single line, which is generally not useful. Usually one can get away with using just color= to suggest the groups, but I often include both group= and color= in case I later change how colors/linetypes/shapes/... are defined and the groups are accidentally changed.
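To illustrate why I keep group= separate, here is a sketch where the color mapping is changed to Var1 only. Because group= still uses the interaction, each of the 44 series should remain its own line, just drawn in four colors. Coloring by Var1 is purely an illustrative choice on my part, as are the names melted_all and p_all.

```r
library(ggplot2)

set.seed(42)
myseq <- seq(from = 1, to = 99, by = 5)
mtx <- array(rnorm(880, 0, 1), c(4, 11, length(myseq)))

# Melt the whole array and replace the third index with myseq.
melted_all <- reshape2::melt(mtx)
melted_all$Var3 <- myseq[melted_all$Var3]

# group= keeps one line per (Var1, Var2) pair;
# color= now only distinguishes the four planes.
p_all <- ggplot(melted_all,
                aes(Var3, value,
                    group = interaction(Var1, Var2),
                    color = factor(Var1))) +
  geom_line()

print(p_all)
```

Dropping group= from this call would leave ggplot2 grouping by color alone, so the eleven series within each plane would be joined into one zig-zagging line.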