Tags: r, ggplot2, alpha, geom-bar, demographics

Ggplot2 in R gives incorrect coloring when creating overlapping demographic pyramids


I am creating overlapping demographic pyramids in R with the ggplot2 library to compare demographic data from two different sources.

However, I have run into problems with ggplot2 and the colouring when using the alpha parameter. I have tried to make sense of ggplot2 and the geom_bar structure, but so far it has gotten me nowhere. The idea is to draw four geom_bars, where two geom_bars overlap each other on each side (males and females, respectively). I'd have no problems if I didn't need to use alpha to show the differences in my data.

I would really appreciate some answers on where I am going wrong here. As an R programmer I am pretty close to a beginner, so bear with me if my code looks weird.

Below is my code which results in the image also shown below. I have altered my demographic data to be random for this question.

library(ggplot2)

# Here I randomise my data for StackOverflow
poptest<-data.frame(matrix(NA, nrow = 101, ncol = 5))
poptest[,1]<- seq(0,100)
poptest[,2]<- rpois(n = 101, lambda = 100)
poptest[,3]<- rpois(n = 101, lambda = 100)
poptest[,4]<- rpois(n = 101, lambda = 100)
poptest[,5]<- rpois(n = 101, lambda = 100)
colnames(poptest) <- c("age","A_males", "A_females","B_males", "B_females")

myLimits<-c(-250,250)
myBreaks<-seq(-250,250,50)

# Plot demographic pyramid
poptestPlot <- ggplot(data = poptest) +
  geom_bar(aes(age,A_females,fill="black"), stat = "identity", alpha=0.75, position = "identity")+
  geom_bar(aes(age,-A_males, fill="black"), stat = "identity", alpha=0.75, position="identity")+
  geom_bar(aes(age,B_females, fill="white"), stat = "identity", alpha=0.5, position="identity")+
  geom_bar(aes(age,-B_males, fill="white"), stat = "identity", alpha=0.5, position="identity")+
  coord_flip()+

  #set the y-axis which (because of the flip) shows as the x-axis
  scale_y_continuous(name = "",
                     limits = myLimits,
                     breaks = myBreaks,
                     #give the values on the y-axis a name, to remove the negatives
                     #give abs() command to remove negative values
                     labels = paste0(as.character(abs(myBreaks))))+

  #set the x-axis which (because of the flip) shows as the y-axis
  scale_x_continuous(name = "age",breaks=seq(0,100,5)) +
  #remove the legend
  theme(legend.position = 'none')+

  # Annotate geom_bars
  annotate("text", x = 100, y = -200, label = "males",size=6)+
  annotate("text", x = 100, y = 200, label = "females",size=6)

# show results in a separate window
x11()
print(poptestPlot) 

This is what I get as a result: (sorry, as a StackOverflow noob I can't embed my pictures)

[Image: ggplot2 result]

The colouring is really nonsensical. Black is not black and white is not white. Instead, ggplot2 seems to use some sort of default colouring, perhaps because it can't interpret my code.

I welcome any and all answers. Thank you.


Solution

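  • You are trying to map "black" to data points. Inside aes(), fill="black" is treated as a data value rather than a colour name, so ggplot2 assigns it a colour from its default palette. To get literal colours that way, you would have to add a manual scale and tell ggplot to colour each instance of "black" in colour "black". There is a shortcut for this; since the aesthetic here is fill, it is scale_fill_identity. However, if this is your only level, it is much easier to just use fill outside the aes. This way the whole geom is filled in black or white respectively: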

    poptestPlot <- ggplot(data = poptest) +
      geom_bar(aes(age,A_females),fill="black", stat = "identity", alpha=0.75, position = "identity")+
      geom_bar(aes(age,-A_males), fill="black", stat = "identity", alpha=0.75, position="identity")+
      geom_bar(aes(age,B_females), fill="white", stat = "identity", alpha=0.5, position="identity")+
      geom_bar(aes(age,-B_males), fill="white", stat = "identity", alpha=0.5, position="identity")+
      coord_flip()+
    
      #set the y-axis which (because of the flip) shows as the x-axis
      scale_y_continuous(name = "",
                         limits = myLimits,
                         breaks = myBreaks,
                         #give the values on the y-axis a name, to remove the negatives
                         #give abs() command to remove negative values
                         labels = paste0(as.character(abs(myBreaks))))+
    
      #set the x-axis which (because of the flip) shows as the y-axis
      scale_x_continuous(name = "age",breaks=seq(0,100,5)) +
      #remove the legend
      theme(legend.position = 'none')+
    
      # Annotate geom_bars
      annotate("text", x = 100, y = -200, label = "males",size=6)+
      annotate("text", x = 100, y = 200, label = "females",size=6)
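
To see the difference between fill inside and outside aes(), you can inspect the values each layer actually draws with. The following is only a minimal sketch on made-up data: the data frame toy and its columns age and females are hypothetical stand-ins, not the data from the question. It uses ggplot_build() to pull out the computed fill values.

```r
library(ggplot2)

# Hypothetical toy data, not the asker's real data
toy <- data.frame(age = 0:4, females = c(5, 4, 3, 2, 1))

# fill inside aes(): the string "black" is treated as a data value
mapped <- ggplot(toy) +
  geom_bar(aes(age, females, fill = "black"), stat = "identity")

# fill outside aes(): "black" is used directly as the bar colour
fixed <- ggplot(toy) +
  geom_bar(aes(age, females), fill = "black", stat = "identity")

# ggplot_build() exposes the fill values each layer is drawn with
mapped_fill <- unique(ggplot_build(mapped)$data[[1]]$fill)
fixed_fill  <- unique(ggplot_build(fixed)$data[[1]]$fill)

print(mapped_fill)  # a colour picked by the default discrete scale
print(fixed_fill)   # the literal colour that was set
```

The mapped version should report a palette colour rather than "black", which matches the odd colouring seen in the question.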
    

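
If you would rather keep fill inside aes(), the identity-scale route could look roughly like the sketch below. Because the mapped aesthetic is fill, the matching function is scale_fill_identity(), which tells ggplot2 to read the mapped strings as literal colour names. The data frame toy and its values are made up for illustration, and only the female side of the pyramid is drawn to keep the example short.

```r
library(ggplot2)

# Hypothetical toy data standing in for two sources (A and B)
toy <- data.frame(
  age       = 0:4,
  A_females = c(5, 4, 3, 2, 1),
  B_females = c(4, 4, 2, 3, 1)
)

# fill stays inside aes(); scale_fill_identity() makes ggplot2 use the
# mapped strings "black" and "white" as the colours themselves
identity_plot <- ggplot(toy) +
  geom_bar(aes(age, A_females, fill = "black"), stat = "identity",
           alpha = 0.75, position = "identity") +
  geom_bar(aes(age, B_females, fill = "white"), stat = "identity",
           alpha = 0.5, position = "identity") +
  scale_fill_identity() +
  coord_flip()

# Check the fill values each layer ends up with
built <- ggplot_build(identity_plot)
layer_fills <- c(unique(built$data[[1]]$fill), unique(built$data[[2]]$fill))
print(layer_fills)
```

Setting fill outside aes() remains the simpler choice when each layer has a single colour; the identity scale is mainly useful when the colour names live in a column of the data.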