r ggplot2 density-plot

Is there a way to color a density plot on either side of a value without splitting the data into two separate groups?


I am trying to create a density plot that changes color at the average value of a variable. However, when I add the split, the plot does not keep its form as one density curve; it breaks into two separate curves at that point. I want to mark the average, but the split creates the illusion of two different peaks, which do not exist in the data.

This is my current code, which produces the two-peak chart:

ggplot(FY23, aes(x=AV, y=..density..))+
  geom_density(aes(fill=AV<602226.34))+
  labs(x = "House Value", y = "Count", title = "Frequency of Housing Values")+
  scale_x_continuous(breaks =c(250000, 500000,750000,1000000,1250000), labels= c("$250,000","$500,000","$750,000","$1,000,000","$1,250,000"),
                     limits = c(50000, 1250000))+
  scale_y_continuous(breaks = c(0,0.000002,0.000004,0.000006),labels = c("0","500","1,000","1,500"))+
  geom_vline(xintercept=576200, linetype = "dashed")+
  
  annotate(x=576200,y=+Inf,label="Median",vjust=4,geom="label")+
  scale_fill_discrete(name = "FY 23 Average Home Value",labels = c("Above Average", "Below Average"))+
  theme_minimal()

[Figure: two-peak density chart with average]

I want it to keep the single peak it had before the split, but with two different colors to distinguish the values below and above the average.

[Figure: single-peak density chart without the average split]


Solution

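  • Mapping fill to AV < 602226.34 inside geom_density splits the data into two groups, and a separate density is estimated for each. Instead, you can estimate the density once on the full data, use the data frame produced by as.data.frame(density(FY23$AV)), flag which side of the cut point each x value falls on, and plot a geom_ribbon (this example cuts at the median; substitute mean(FY23$AV) to cut at the average):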

    library(ggplot2)
    
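    # Estimate one density for all the data, then flag the x values above the median
    within(as.data.frame(density(FY23$AV)), group <- x > median(FY23$AV)) |>
      ggplot(aes(x, y, fill = group)) +
      # Fill the area under the curve; the fill color depends on the flag
      geom_ribbon(aes(ymin = 0, ymax = y), alpha = 0.5) +
      geom_line() +
      geom_vline(xintercept = median(FY23$AV), linetype = 2) +
      # FALSE (at or below the median) is 'low', TRUE (above) is 'high'
      scale_fill_manual(NULL, values = c('red3', 'green4'), 
                        labels = c('low', 'high')) +
      # Relabel the density axis with the same rescaling used for 'Count' in the question
      scale_y_continuous('Count', labels = ~ .x * 1e9/2) +
      scale_x_continuous('House Price', labels = scales::dollar) +
      theme_minimal(base_size = 16)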
    



    Data used

    set.seed(1)
    
    FY23 <- data.frame(AV = rnorm(100, 6e5, 1e5))
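
The question asks for a split at the average rather than the median, and filling by a flag computed on the density grid can leave a thin gap between the two colored areas, because no grid point sits exactly on the cut. The following is a minimal sketch of one way to handle both, not the answerer's code: it estimates the density once, interpolates the curve height at the mean with approx(), and gives each side its own copy of that point so the two ribbons meet. The names cut_at, dd, below, above, and split_density are made up for illustration, and the data is the same simulated sample as above, not the asker's real FY23 table.

```r
library(ggplot2)

# Simulated stand-in for the asker's data (assumption: a single numeric column AV)
set.seed(1)
FY23 <- data.frame(AV = rnorm(100, 6e5, 1e5))

# Cut at the mean, as the question asks
cut_at <- mean(FY23$AV)

# One density estimate over the full data, so the curve keeps a single peak
d  <- density(FY23$AV)
dd <- data.frame(x = d$x, y = d$y)

# Height of the curve exactly at the cut point, by linear interpolation
y_cut <- approx(dd$x, dd$y, xout = cut_at)$y

# Each side gets its own copy of the cut point so the two ribbons touch
below <- rbind(dd[dd$x < cut_at, ], data.frame(x = cut_at, y = y_cut))
above <- rbind(data.frame(x = cut_at, y = y_cut), dd[dd$x > cut_at, ])
below$side <- "Below Average"
above$side <- "Above Average"
split_density <- rbind(below, above)

# fill is mapped only in geom_ribbon, so geom_line draws one unbroken outline
p <- ggplot(split_density, aes(x, y)) +
  geom_ribbon(aes(ymin = 0, ymax = y, fill = side), alpha = 0.5) +
  geom_line() +
  geom_vline(xintercept = cut_at, linetype = "dashed") +
  scale_x_continuous("House Value", labels = scales::dollar) +
  labs(y = "Density", fill = NULL) +
  theme_minimal()
p
```

Because the fill mapping lives only in the ribbon layer, the outline is not split into groups, and the curve is still the single density of the whole sample. To use real data, replace the simulated FY23 with your own data frame; the rest of the sketch only assumes a numeric AV column.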