I have data which I'd like to plot using ggplot's geom_point:
set.seed(1)
df <- data.frame(x=rnorm(100),y=rnorm(100),val=c(rnorm(90),rep(NA,10)))
I add colors according to intervals of df$val:
intervals.df <- data.frame(interval=c("(-3,-2]","(-2,-0.999]","(-0.999,0]","(0,1.96]","(1.96,3.91]","(3.91,5.87]","not expressed"),
                           start=c(-3,-2,-0.999,0,1.96,3.91,NA),end=c(-2,-0.999,0,1.96,3.91,5.87,NA),
                           col=c("#2f3b61","#436CE8","#E0E0FF","#7d4343","#C74747","#EBCCD6","#D3D3D3"),stringsAsFactors=F)
df <- cbind(df,do.call(rbind,lapply(df$val,function(x){
  if(is.na(x)){
    return(data.frame(col=intervals.df$col[nrow(intervals.df)],interval=intervals.df$interval[nrow(intervals.df)]))
  } else{
    idx <- which(intervals.df$start <= x & intervals.df$end >= x)
    return(data.frame(col=intervals.df$col[idx],interval=intervals.df$interval[idx]))
  }
})))
Here I set df$col as a factor and set the labels to be the intervals so I can plot them in the legend:
df$col <- factor(df$col,levels=intervals.df$col,labels=intervals.df$interval)
This will also display all the intervals, including those that df$val might not cover, which is what I want.
And here's how I try to plot it:
library(ggplot2)
ggplot(df,aes(x=x,y=y,colour=col))+geom_point(cex=2,shape=1,stroke=1)+labs(x="X",y="Y")+theme_bw()+theme(legend.key=element_blank(),panel.border=element_blank(),strip.background=element_blank())+scale_shape(solid=T)+scale_fill_manual(drop=FALSE,values=levels(df$col),name="DE")
Which gets me close but the colors are not right:
So I thought this plot command would correct that (adding scale_color_manual):
ggplot(df,aes(x=x,y=y,colour=col))+geom_point(cex=2,shape=1,stroke=1)+labs(x="X",y="Y")+theme_bw()+theme(legend.key=element_blank(),panel.border=element_blank(),strip.background=element_blank())+scale_shape(solid=T)+scale_fill_manual(drop=FALSE,values=levels(df$col),name="DE")+scale_color_manual(drop=FALSE,values=levels(df$col),name="DE")
But that throws the error:
Error in grDevices::col2rgb(colour, TRUE) : invalid color name '(0,1.96]'
So, how do I get the colors right (and the legend name right too)?
One option is to map the colors to interval after setting its levels via intervals.df, so that both the order and the number of levels are correct. Use the colors from intervals.df, making a named vector of the colors to pass to scale_color_manual.
# Set levels of interval via intervals.df
df$interval = factor(df$interval, levels=intervals.df$interval)
# Named vector of the colors based on intervals.df
colors = intervals.df$col
names(colors) = intervals.df$interval
ggplot(df, aes(x=x, y=y, colour=interval))+
  geom_point(cex=2, shape=1, stroke=1) +
  labs(x="X", y="Y")+
  theme_bw()+
  theme(legend.key=element_blank(),
        panel.border=element_blank(), strip.background=element_blank())+
  scale_color_manual(values = colors, name = "DE", drop = FALSE)
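As for why the original call failed, a likely explanation is that once df$col is rebuilt with labels=intervals.df$interval, levels(df$col) returns the interval labels rather than the hex codes, so values=levels(df$col) hands strings such as '(0,1.96]' to the color scale. The base-R sketch below illustrates this on a reduced, made-up subset of the table; the names small, col.factor and named.colors are illustrative only and do not appear in the question.

```r
# Reduced, illustrative copy of the lookup table (not the full intervals.df)
small <- data.frame(
  interval = c("(-3,-2]", "(0,1.96]", "not expressed"),
  col      = c("#2f3b61", "#7d4343", "#D3D3D3"),
  stringsAsFactors = FALSE
)

# Same construction as in the question: hex codes as levels, intervals as labels
col.factor <- factor(c("#7d4343", "#D3D3D3"),
                     levels = small$col,
                     labels = small$interval)

# After relabelling, the levels are the interval strings, not colors
levels(col.factor)
# "(-3,-2]" "(0,1.96]" "not expressed"

# A named vector keeps the hex codes and ties each one to its level
named.colors <- setNames(small$col, small$interval)
named.colors[["(0,1.96]"]]
# "#7d4343"
```

Because the vector is named, each color is matched to its level by name, so the mapping does not depend on the order of the values.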