I have a dataframe with values corresponding to two separate groups evaluated over time. Mock data below:
Gene Name   Sample S1   Sample S2   Sample S3   Sample R1   Sample R2   Sample R3
Gene 1      4           5           3           3           39          44
Gene 2      4           100         33          3           32          14
I melted my dataframe and compiled summary stats using the summarySE function. I then plotted my data using the following script:
plot <- ggplot(tgastats2, aes(x=`Gene Name`, y=value, fill=Sample)) +
  geom_bar(position=position_dodge(), stat="identity") +
  geom_errorbar(aes(ymin=value-se, ymax=value+se),
                width=.2,
                position=position_dodge(.9))
What I would like to do is plot the values of S1-3 grouped together and R1-3 on the same plot separated with some space. Any help would be appreciated.
Here's the data in a reproducible way:
df <- data.frame(
Gene_name=c('Gene 1', 'Gene 2'),
Sample.S1=c(4,4), Sample.S2=c(5,100), Sample.S3=c(3,33),
Sample.R1=c(3,3), Sample.R2=c(39,32), Sample.R3=c(44,14)
)
Now, for a solution. As you indicated, we need to "melt" the dataset. My preference is to use gather() from tidyr, but melt() works in a similar manner:
library(tidyr)  # provides gather() and re-exports the %>% pipe
df1 <- df %>% gather(key='Sample', value='value', -Gene_name)
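As a side note, gather() has been superseded in newer tidyr releases by pivot_longer(). Here is a minimal sketch of the same reshaping, assuming tidyr 1.0.0 or later is installed. The mock df is rebuilt so the snippet runs on its own, and df1_long is just an illustrative name:

```r
library(tidyr)

# Mock data from the question
df <- data.frame(
  Gene_name = c('Gene 1', 'Gene 2'),
  Sample.S1 = c(4, 4), Sample.S2 = c(5, 100), Sample.S3 = c(3, 33),
  Sample.R1 = c(3, 3), Sample.R2 = c(39, 32), Sample.R3 = c(44, 14)
)

# Every column except Gene_name becomes one (Sample, value) row
df1_long <- pivot_longer(df, cols = -Gene_name,
                         names_to = 'Sample', values_to = 'value')
```

The rows come out in a different order than with gather(), but that makes no difference for plotting.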
For ggplot2 to group the bars the way you describe, the data needs a column that says which group each sample belongs to. R and ggplot2 can't infer that S1, S2, and S3 belong together, so you have to tell them explicitly. There are many ways to separate and categorize. Without seeing your actual melted data frame, tgastats2, I'll have to assume it's similar to the example posted. I'm going to use the fact that the names of samples R1-R3 all contain a capital "R", whereas the others do not:
df1$my_group <- ifelse(grepl('R',df1$Sample),'R','S')
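The grepl() test would misfire if an "R" ever appeared elsewhere in a sample name, and it only handles two groups. A slightly stricter sketch pulls the group letter out of the name instead. It assumes every name looks like "Sample.<letter><digits>", as in the mock data, which may not hold for your real tgastats2:

```r
# Sample names in the order gather() produces them for the mock data
Sample <- rep(c('Sample.S1', 'Sample.S2', 'Sample.S3',
                'Sample.R1', 'Sample.R2', 'Sample.R3'), each = 2)

# Keep only the capital letter that follows "Sample."
my_group <- sub('^Sample\\.([A-Z])[0-9]+$', '\\1', Sample)

# Quick sanity check: each group should have the same number of rows
table(my_group)
```

For the mock data this yields the same labels as the ifelse(grepl(...)) line above.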
Then you can plot:
ggplot(df1, aes(x=Gene_name, y=value, fill=my_group)) +
geom_col(position='dodge', color='black')
Hm... that doesn't look right. What's going on? Well, ggplot is separating based on df1$my_group, but there are 3 values in each of those groups for every gene, so they all land in the same bar position. You can separate those out by using the group= aesthetic in addition to the fill= aesthetic, and ggplot will dodge each sample individually:
ggplot(df1, aes(x=Gene_name, y=value, fill=my_group, group=Sample)) +
geom_col(position='dodge', color='black')
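If you also want S drawn before R and a visible gap between the two clusters, one possible sketch is to put my_group on the x axis and give each gene its own facet. This is only one layout among several. It assumes the long-format df1 from above (rebuilt here so the snippet runs on its own), and p is just an illustrative name:

```r
library(ggplot2)

# Long-format mock data, same content as df1 above
df1 <- data.frame(
  Gene_name = rep(c('Gene 1', 'Gene 2'), times = 6),
  Sample = rep(c('Sample.S1', 'Sample.S2', 'Sample.S3',
                 'Sample.R1', 'Sample.R2', 'Sample.R3'), each = 2),
  value = c(4, 4, 5, 100, 3, 33, 3, 3, 39, 32, 44, 14)
)
df1$my_group <- ifelse(grepl('R', df1$Sample), 'R', 'S')

# Factor levels control the order on the axis: S first, then R
df1$my_group <- factor(df1$my_group, levels = c('S', 'R'))

# One panel per gene; within a panel the S and R clusters sit at separate x positions
p <- ggplot(df1, aes(x = my_group, y = value, fill = my_group, group = Sample)) +
  geom_col(position = 'dodge', color = 'black') +
  facet_wrap(~ Gene_name)
p
```

Because the default bar width leaves room between x positions, the three S bars and the three R bars show up as two separate clusters in each gene's panel.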