Manual color and conditional fill without overriding position_dodge in geom_point?


I am trying to create a scatterplot with connected dots. As I have several overlapping data points, I used position=position_dodge to separate them visually. At the same time, I am colouring the dots and lines with a preset vector of colours. I am also trying to fill some dots with black, using a condition based on a factor. My problem is that as soon as I add the fill condition, the dodging gets messed up: the dots are no longer dodged consistently with the lines that connect them.

Here is how these plots can be made:

library(ggplot2)

# Creating an example dataframe
id <- as.factor(rep(seq(from=1, to=6), times=4))
state <- rep(c("a", "b", "c", "d"), each=6)
time <- rep(seq(from=3.5, to=9, by=0.5), each=2)
yesorno <- rep(c("y", "n", "n"), times=8) # condition for fill
sex <- rep(c(rep("m", times=3), rep("f", times=3)), times=4)

d <- data.frame(id, state, time, yesorno, sex)
d$sex_id <- paste(d$sex, d$id, sep="_") # combined key so that two color gradients can share one color scale (values sort alphabetically: females first, then males)

m <- scales::seq_gradient_pal("lightcyan2", "midnightblue", "Lab")(seq(0,1,length.out = 3)) # used for three male individuals
f <- scales::seq_gradient_pal("burlywood1", "red4", "Lab")(seq(0,1,length.out = 3)) # used for three female individuals
fm <- c(f, m) # females first, matching the alphabetical order of sex_id

ggplot()+
  geom_point(data=d, aes(x=factor(state), y=time, fill= factor(yesorno), color=factor(sex_id)), shape=21, size=3, position=position_dodge(width=0.3))+ # if "fill=..." is removed, then dodging works
  geom_line(data=d, size=0.7, aes(x=factor(state), y=time, color=factor(sex_id), group=factor(sex_id)), position=position_dodge(width=0.3)) +
  scale_color_manual(values=fm)+
  scale_fill_manual(values=c("white", "black"))

Solution

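  • I think you just need to move the group aesthetic into the main call to ggplot so that it applies to both the point and line geoms. Applying the grouping only to the line geom is what was causing the dodging to be applied inconsistently: without an explicit group, ggplot2 derives a layer's groups from all the discrete variables mapped in that layer, so adding fill changes how the points are grouped while the lines are still grouped by sex_id. I've also moved a few other portions of the code into the main call to ggplot to avoid having to repeat them in each geom.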

    pd = position_dodge(0.3)
    
    ggplot(d, aes(x=factor(state), y=time, color=factor(sex_id), group=factor(sex_id)))+
      geom_point(aes(fill=factor(yesorno)), shape=21, size=3, position=pd) + 
      geom_line(size=0.7, position=pd) +
      scale_color_manual(values=fm)+
      scale_fill_manual(values=c("white", "black")) +
      labs(colour="Sex_ID", fill="") +
      theme_classic()
    


    One other thing: you don't need to create a separate sex_id column if you don't want to. You can instead use the interaction function to combine sex and id on the fly. In that case, though, you'll also want to create a named vector of colors, so that each color is matched to the intended sex/id combination:

    fm = setNames(c(m, f), unique(interaction(d$sex, d$id, sep="_")))
    
    ggplot(d, aes(x=factor(state), y=time, color=interaction(sex, id, sep="_", lex.order=TRUE), 
                  group=interaction(sex, id, sep="_", lex.order=TRUE))) +
      geom_point(aes(fill=factor(yesorno)), shape=21, size=3, position=pd) + 
      geom_line(size=0.7, position=pd) +
      scale_color_manual(values=fm)+
      scale_fill_manual(values=c("white", "black")) +
      labs(colour="Sex_ID", fill="") +
      theme_classic()
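
If you want to check the grouping explanation yourself, one option is to compare the dodged x positions that ggplot2 computes for the point layer and the line layer. The following is only a minimal sketch under a few assumptions: it rebuilds the example data from the question, leaves out the manual scales (they don't affect dodging), and uses two helpers, dodged_x and n_groups, which are hypothetical names of mine and not part of ggplot2. It relies on layer_data(), which returns the data ggplot2 computed for a given layer.

```r
library(ggplot2)

# Example data, rebuilt as in the question
d <- data.frame(
  id      = factor(rep(1:6, times = 4)),
  state   = rep(c("a", "b", "c", "d"), each = 6),
  time    = rep(seq(3.5, 9, by = 0.5), each = 2),
  yesorno = rep(c("y", "n", "n"), times = 8),
  sex     = rep(rep(c("m", "f"), each = 3), times = 4)
)
d$sex_id <- paste(d$sex, d$id, sep = "_")

pd <- position_dodge(width = 0.3)

# Hypothetical helper (not part of ggplot2): dodged x positions of one layer,
# ordered by colour and y so that two layers can be compared row by row
dodged_x <- function(plot, layer) {
  ld <- layer_data(plot, layer)
  as.numeric(ld$x[order(ld$colour, ld$y)])
}

# Hypothetical helper (not part of ggplot2): number of groups in one layer
n_groups <- function(plot, layer) {
  length(unique(layer_data(plot, layer)$group))
}

# As in the question: group is set on the line layer only
p_question <- ggplot(d, aes(x = state, y = time, color = sex_id)) +
  geom_point(aes(fill = yesorno), shape = 21, size = 3, position = pd) +
  geom_line(aes(group = sex_id), position = pd)

# As in the answer: group is set once, in the main ggplot() call
p_answer <- ggplot(d, aes(x = state, y = time, color = sex_id, group = sex_id)) +
  geom_point(aes(fill = yesorno), shape = 21, size = 3, position = pd) +
  geom_line(position = pd)

# Do the points (layer 1) sit at the same x positions as the lines (layer 2)?
aligned_question <- isTRUE(all.equal(dodged_x(p_question, 1), dodged_x(p_question, 2)))
aligned_answer   <- isTRUE(all.equal(dodged_x(p_answer, 1), dodged_x(p_answer, 2)))
print(c(question = aligned_question, answer = aligned_answer))

# How many groups does the point layer have in each version?
print(c(question = n_groups(p_question, 1), answer = n_groups(p_answer, 1)))
```

With the grouping set in the main call, the points and lines should share the same six groups and therefore the same dodged positions. For the question's version I would expect the comparison to come out as not aligned, since that is the behavior the question describes, but I haven't pinned that to a specific ggplot2 version.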