Search code examples
rggplot2histogram

Using ggplot histogram instead of hist function in R


I am using a package called BetaMixture in R to fit a mixture of beta distributions for a data vector. The output is supplied to a hist that produces a good histogram with the mixture model components:

# Install and load the libraries

#install.packages("BetaModels")
library(BetaModels)

# Create a vector, fit mixture models and plot the histogram

vec <- c(rbeta(700, 5, 2), rbeta(300, 1, 10))
model <- BetaMixture(vec,2)
h <- hist(model, breaks = 35)

enter image description here

So far so good. Now how do I get this in ggplot? I inspected the h object but that is not different from the model object. They are exactly the same. I don't know how this hist even works for this class. What does it pull from the model to generate this plot other than the @datavec?


Solution

  • You can get the hist function for BetaMixed objects using getMethod("hist", "BetaMixture").
    Below you can find a simple translation of this function into the "ggplot2 world".

    myhist <- function (x, ...) {
        .local <- function (x, mixcols = 1:7, breaks=25, ...) 
        {
            df1 <- data.frame(x=x@datavec)
            p <- ggplot(data=df1, aes(x=x)) + 
                 geom_histogram(aes(y=..density..), bins=breaks, alpha=0.5, fill="gray50", color="black")
            while (length(mixcols) < ncol(x@mle)) mixcols <- c(mixcols, 
                mixcols)
            xv <- seq(0, 1, length = 502)[1:501]
            for (J in 1:ncol(x@mle)) {
                y <- x@phi[J] * dbeta(xv, x@mle[1, J], x@mle[2, J])
                df2 <- data.frame(xv, y)
                p <- p + geom_line(data=df2, aes(xv, y), size=1, col=mixcols[J])
            }
            p <- p + theme_bw()
            invisible(p)
        }
        .local(x, ...)
    }
    
    library(ggplot2)
    # Now p is a ggplot2 object.
    p <- myhist(model, breaks=35)
    print(p)
    

    enter image description here