I have two different datasets with different numbers of observations. I would like to plot two boxplots on the same graph so that they are easier to compare. I could plot each boxplot separately, but it would be hard to spot any discrepancy without having them side by side.
I have some fake data.
Group A
V1 V2 V3 V4 V5
6.5 2 11 0.5 6
7 1 8 0.34 8
5.4 4 7.8 0.45 5
3.4 6 9.1 0.72 5
Group B
V1 V2 V3 V4 V5
5.0 5 9 0.4 7
2 7 5.2 0.69 5
3.2 2 2.9 0.79 2
6.8 9 6.5 0.43 6
4.7 3 3.8 0.49 4
5.5 4 7.4 0.94 3
I don't know how to graph this, so I don't have an example, but I'll try my best to describe the plot. I would like to plot Variable 1 for Groups A and B on the same graph: one boxplot for Group A and another for Group B, both built from the V1 data and placed side by side. With 5 variables, I would have 5 graphs, each with 2 boxplots side by side. Please let me know if I'm not being clear. Thank you.
ggplot works best with "long format" data (e.g., with a column for each of value, variable, and group). You can rearrange your data as follows:
A <- read.table(text='V1 V2 V3 V4 V5
6.5 2 11 0.5 6
7 1 8 0.34 8
5.4 4 7.8 0.45 5
3.4 6 9.1 0.72 5', header=TRUE)
B <- read.table(text='V1 V2 V3 V4 V5
5.0 5 9 0.4 7
2 7 5.2 0.69 5
3.2 2 2.9 0.79 2
6.8 9 6.5 0.43 6
4.7 3 3.8 0.49 4
5.5 4 7.4 0.94 3', header=TRUE)
d <- rbind(cbind(stack(A), group='A'), cbind(stack(B), group='B'))
The first few rows look like this:
head(d)
##   values ind group
## 1    6.5  V1     A
## 2    7.0  V1     A
## 3    5.4  V1     A
## 4    3.4  V1     A
## 5    2.0  V2     A
## 6    1.0  V2     A
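If you end up with more than two groups, the same stack/rbind step can be wrapped in a small helper. The following is only a sketch: to_long and d_long are hypothetical names that are not part of the answer above, and the data is cut down to two columns to keep it short.

```r
# Two small data frames with different numbers of rows (subset of the data above).
A <- data.frame(V1 = c(6.5, 7, 5.4, 3.4), V2 = c(2, 1, 4, 6))
B <- data.frame(V1 = c(5.0, 2, 3.2, 6.8, 4.7, 5.5), V2 = c(5, 7, 2, 9, 3, 4))

# Hypothetical helper: takes a named list of data frames, stacks each one into
# long format with stack(), and tags the rows with the list name as the group.
to_long <- function(groups) {
  pieces <- lapply(names(groups), function(g) {
    cbind(stack(groups[[g]]), group = g)
  })
  do.call(rbind, pieces)
}

d_long <- to_long(list(A = A, B = B))
str(d_long)  # columns: values, ind, group
```

Because every piece has the same three columns, rbind does not care that the groups have different numbers of observations.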
Now we can plot like so:
library(ggplot2)
ggplot(d, aes(group, values)) +
  geom_boxplot() +
  facet_wrap(~ind, scales='free_y')
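If you would rather not depend on ggplot2, roughly the same layout can be sketched with base graphics. This is an alternative technique rather than the approach above: it loops over the variables and draws one panel per variable with boxplot(). The 2 x 3 panel grid is an assumption chosen to fit five variables.

```r
A <- read.table(text='V1 V2 V3 V4 V5
6.5 2 11 0.5 6
7 1 8 0.34 8
5.4 4 7.8 0.45 5
3.4 6 9.1 0.72 5', header=TRUE)
B <- read.table(text='V1 V2 V3 V4 V5
5.0 5 9 0.4 7
2 7 5.2 0.69 5
3.2 2 2.9 0.79 2
6.8 9 6.5 0.43 6
4.7 3 3.8 0.49 4
5.5 4 7.4 0.94 3', header=TRUE)

# Same long-format data as in the answer.
d <- rbind(cbind(stack(A), group='A'), cbind(stack(B), group='B'))

# One panel per variable, each with the two groups side by side.
op <- par(mfrow=c(2, 3))  # assumed grid: 2 rows x 3 columns for 5 panels
for (v in levels(d$ind)) {
  boxplot(values ~ group, data=d[d$ind == v, ],
          main=v, xlab='group', ylab='value')
}
par(op)  # restore the previous graphics settings
```

Each panel gets its own y-axis range, which plays the same role as scales='free_y' in the faceted version.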