Search code examples
rggplot2plotstatisticshistogram

histogram with densities estimated by a model in ggplot2


I'm trying to make a plot in ggplot2 of the densities estimated by a model fitted in gamlss.

I performed this using R base, as shown below:

library(gamlss)
library(ggplot2)

data(Orange)

mod.g = gamlss(circumference ~ age, 
              family=GA, data = Orange)

pred.g <- predict(mod.g, type = "r")

shapex = (mean(pred.g)/sd(pred.g))^2
ratex = mean(pred.g)/sd(pred.g)^2


hist(Orange$circumference, freq = FALSE, breaks = seq(0, 240, 20))
curve(dgamma(x,
             shapex,
             ratex), add = T,col = "blue",lwd=2)
legend("topright", legend = c("Gamma"), lty = 1, col = "blue")

Result:

enter image description here

However, when I tried to perform this in ggplot2 the lines are not being plotted, see:

ggplot(Orange, aes(x = circumference)) +
  geom_histogram(color = "black", fill = "#225EA8", binwidth=30) +
  geom_line(aes(shapex, ratex)) +
  theme(legend.title = element_text(size = 15),
        legend.text = element_text(size = 17),
        axis.title = element_text(size = 22),
        axis.text.x = element_text(color = "black", hjust=1),
        axis.text.y = element_text(color = "black", hjust=1),
        axis.text = element_text(size = 15),
        strip.text.x = element_text(size = 18))

enter image description here


Solution

  • After_stat is necessary, but doesn't do the entire trick. With curve you are actually plotting a function. You are passing two constants to your geom_line - how are you expecting ggplot2 to know that you want to plot a gamma distribution with those two constants as parameter?

    For this, you could use stat_function

    ggplot(Orange, aes(x = circumference)) +
      geom_histogram(aes(y = after_stat(density)), color = "black", fill = "#225EA8", binwidth=30) +
      stat_function(fun = function(x) dgamma(x, shapex, ratex))  
    

    Created on 2023-02-15 with reprex v2.0.2