I am plotting and saving multiple plots using ggplot in a for loop, which loops through a list. Sample list elements look like this:
head(genes[[1]])
name fc fdr
gene1 -2.0143529 0.0002
protein1 -3.2256188 0.0001
I used the following code to plot:
p <- list()
for (i in seq_along(genes)) {
  p[[i]] <- ggplot(genes[[i]], aes(name, fc, label = name)) +
    geom_col(aes(fill = factor(fdr < 0.05)), position = "dodge", width = 1) +
    coord_flip() + scale_fill_manual(values = c("#00BFC4", "#F8766D"))
}
I have omitted the additional code that deals with formatting. The final plots look like this:
Now, the problem is that I am making over 100 of these plots in the for loop, and I would like the gene (the first row of each list element) to always be the top bar and the protein to be the bottom bar, since I have omitted all labels and plan to use a single legend for all the plots. However, ggplot keeps switching the order of the bars between plots. Is there a way to prevent that, so the gene always stays on top? Thanks, Shash
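ggplot will order the bars alphabetically by name, because a character column on a discrete axis is treated as a factor whose levels are sorted alphabetically.
For example: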
library(ggplot2)
df <- data.frame(name = c("A-gene", "B-protein"), fc = c(-2.2, -3.2), fdr = c(0.2, 0.003))
ggplot(df, aes(name,fc, label= name))+
geom_col(aes(fill=factor(fdr < 0.05)), position = "dodge", width = 1)+
coord_flip()+scale_fill_manual(values = c("#00BFC4","#F8766D"))
dev.off()
To get the bars in the order you require, make the name variable a factor and set the order of its levels. With just two names, you can use relevel as follows:
library(ggplot2)
df <- data.frame(name = c("A-gene", "B-protein"), fc = c(-2.2, -3.2), fdr = c(0.2, 0.003))
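# factor() is needed because data.frame() keeps strings as character by default in R >= 4.0
df$name <- relevel(factor(df$name), ref = as.character(df$name[2]))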
ggplot(df, aes(name,fc, label= name))+
geom_col(aes(fill=factor(fdr < 0.05)), position = "dodge", width = 1)+
coord_flip()+scale_fill_manual(values = c("#00BFC4","#F8766D"))
dev.off()
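This makes the second row (protein) the first factor level, so its bar is drawn next to the origin. After coord_flip() that puts the protein at the bottom and the gene on top.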
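To apply the same idea inside the loop from the question, one option is to set the factor levels from the row order of each list element before plotting. The following is only a sketch: the `genes` list here is a made-up stand-in for the real data, and it assumes each element has the gene in the first row and the protein in the second. Reversing the row order puts the first row at the top once coord_flip() is applied, and it also works for more than two names.

```r
library(ggplot2)

# Hypothetical stand-in for the `genes` list in the question
genes <- list(
  data.frame(name = c("gene1", "protein1"), fc = c(-2.0, -3.2), fdr = c(0.0002, 0.0001)),
  data.frame(name = c("zgene", "aprotein"), fc = c(1.5, -0.7), fdr = c(0.2, 0.01))
)

p <- vector("list", length(genes))
for (i in seq_along(genes)) {
  d <- genes[[i]]
  # Levels in reverse row order: with coord_flip() the first level is drawn
  # at the bottom, so the first row (gene) ends up as the top bar
  d$name <- factor(d$name, levels = rev(as.character(d$name)))
  p[[i]] <- ggplot(d, aes(name, fc, label = name)) +
    geom_col(aes(fill = factor(fdr < 0.05)), position = "dodge", width = 1) +
    coord_flip() +
    scale_fill_manual(values = c("#00BFC4", "#F8766D"))
}
```

Because the levels come from the row order rather than from the names themselves, the bar order stays the same across plots regardless of how the names sort alphabetically.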