r, ggplot2, histogram, kernel-density

Kernel Density Plots and Histogram overlay


I have found a way to overlay my KDE density curve on my histogram using ggplot2. The histogram's y-axis shows frequency, which is correct, but I want to add a secondary y-axis for the density plot. I also don't know how to scale up the density curve so that it is visible against the counts.

The code I'm using is:

library(ggplot2)

data_set <- mammals

# Refer to the column by name inside aes(); ggplot2 looks it up in data_set
ggplot(data = data_set, aes(`Total Averages`)) +
  geom_histogram(col = 'black', fill = 'white', binwidth = 0.5) +
  labs(x = 'Log10 total body mass (kg)', y = 'Frequency', title = 'Average body mass (kg) of mammalian species (male and female)') +
  geom_density(col = 2)

The image below shows what my plot currently looks like.

[Image: the asker's current plot]


Solution

  • Your histogram is plotted using the count per bin of your data. To put the density curve on a comparable scale, map its y aesthetic to the count instead of the density, for example by passing y = ..count.. (written y = after_stat(count) in ggplot2 3.3.0 and later).

    If you want to show the scale of this density on its own axis (for example, scaled to a maximum of 1), you can pass the sec.axis argument to scale_y_continuous (many posts on SO cover the use of this function), as follows:

    library(ggplot2)

    df <- data.frame(Total_average = rnorm(100, 0, 2)) # Dummy example

    ggplot(df, aes(Total_average)) +
      geom_histogram(col = 'black', fill = 'white', binwidth = 0.5) +
      labs(x = 'Log10 total body mass (kg)', y = 'Frequency', title = 'Average body mass (kg) of mammalian species (male and female)') +
      # Draw the density on the count scale; after_stat(count) replaces the older ..count.. notation
      geom_density(aes(y = after_stat(count)), col = 2) +
      # Secondary axis: primary axis divided by 20, so the curve peaks near 1 for this dummy data
      scale_y_continuous(sec.axis = sec_axis(~ . / 20, name = "Scaled Density"))
    

    and you get:

    [Image: histogram with the count-scaled density curve and a secondary "Scaled Density" axis]

    Does this answer your question?
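The divisor 20 in the snippet above fits that particular dummy sample (100 points with a peak density of roughly 0.2, so the count-scaled curve peaks near 20). For real data the divisor can be computed rather than hard-coded. The following is only a sketch under a few assumptions: the names n, binwidth, kde, count_curve, divisor and hist_scale_curve are made up for illustration, and base R's density() is used as a stand-in for what geom_density() computes, so the two peaks should be close but need not match exactly.

```r
# Sketch: derive the secondary-axis divisor from the data instead of hard-coding 20.
set.seed(1)                                   # reproducible dummy data
df <- data.frame(Total_average = rnorm(100, 0, 2))

n        <- nrow(df)                          # number of observations
binwidth <- 0.5                               # same bin width as the histogram

# Kernel density estimate with base R (default bandwidth rule "nrd0")
kde <- density(df$Total_average)

# A count-scaled density curve is the density multiplied by n
count_curve <- kde$y * n

# Dividing the primary axis by the curve's peak puts that peak at 1
divisor <- max(count_curve)

# Optional: multiplying by the bin width as well puts the curve on the
# same scale as the histogram bars (expected count per bin)
hist_scale_curve <- kde$y * n * binwidth

# Plot only if ggplot2 is installed (after_stat() needs ggplot2 >= 3.3.0)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(Total_average)) +
    geom_histogram(col = "black", fill = "white", binwidth = binwidth) +
    geom_density(aes(y = after_stat(count)), col = 2) +
    scale_y_continuous(sec.axis = sec_axis(~ . / divisor, name = "Scaled density"))
  print(p)
}
```

With binwidth = 0.5 the count-scaled curve sits about twice as high as the bars. If the curve should follow the bars instead, aes(y = after_stat(count * 0.5)) is one way to try it, with the secondary-axis divisor adjusted to match.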