r ggplot2 histogram geom-bar probability-distribution

ggplot histogram with custom bin limits and counts


I am trying to create a custom histogram using ggplot2 but am stuck on how to plot the bins. I have 24 bins with the following bin limits and proportions of data falling within each bin.

bin_limits <- c(0, 0.01, 0.025, seq(0.05, 0.95, by = 0.05), 0.975, 0.99, 1)

props <- c(0.07700, 0.03275, 0.04300, 0.05850, 0.04775, 0.04375,
           0.03750, 0.03825, 0.03250, 0.03325, 0.04000, 0.03000, 
           0.02950, 0.02675, 0.03000, 0.02950, 0.03700, 0.03725, 
           0.04325, 0.04250, 0.05425, 0.04675, 0.03575, 0.07325)

What I want is a histogram whose bars span the bin limits and whose heights equal the proportions. I would prefer to do this using ggplot2, but I'd be okay with alternative methods. The closest I have come so far is a scatter plot of the proportions, made with the code below.

library(ggplot2)
library(magrittr)  # provides the %>% pipe

data.frame(limit = bin_limits[-1], prop = props) %>% 
  ggplot(aes(x = limit, y = prop)) +
  geom_point() +
  ylim(0, 0.1)

Solution

    ## a vector with the center of each bin
    bin_ctr <- (bin_limits[-1] + bin_limits[-length(bin_limits)])/2
    ## a vector with the width of each bin
    bin_wdt <- (bin_limits[-1] - bin_limits[-length(bin_limits)])
    ## put everything in a dataframe
    bin_prop <- data.frame(bin_ctr, bin_wdt, props)
    
    library(ggplot2)
    
    ggplot(bin_prop, aes(x=bin_ctr, y=props, width=bin_wdt)) + 
      geom_bar(stat="identity", position="identity", 
               fill = "dodgerblue4", color = "white")
    

    Created on 2024-01-26 with reprex v2.0.2
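
As an alternative to computing centers and widths, the bin edges can be passed to `geom_rect()` directly. The following is a minimal sketch rather than part of the original answer: the data frame name `bins` and its column names are made up for illustration, and the `density` column is an optional extra, included on the assumption that you may want bar area (not height) to equal the proportion because the bins have unequal widths.

```r
library(ggplot2)

# Bin limits and proportions copied from the question
bin_limits <- c(0, 0.01, 0.025, seq(0.05, 0.95, by = 0.05), 0.975, 0.99, 1)
props <- c(0.07700, 0.03275, 0.04300, 0.05850, 0.04775, 0.04375,
           0.03750, 0.03825, 0.03250, 0.03325, 0.04000, 0.03000,
           0.02950, 0.02675, 0.03000, 0.02950, 0.03700, 0.03725,
           0.04325, 0.04250, 0.05425, 0.04675, 0.03575, 0.07325)

# One row per bin: left edge, right edge, and proportion
# (`bins` and its column names are illustrative, not from the answer)
bins <- data.frame(
  xmin = head(bin_limits, -1),
  xmax = tail(bin_limits, -1),
  prop = props
)

# Optional: density = proportion / bin width, so bar area equals the proportion
bins$density <- bins$prop / (bins$xmax - bins$xmin)

# Rectangles drawn from the bin edges, with height equal to the proportion
p <- ggplot(bins) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = 0, ymax = prop),
            fill = "dodgerblue4", color = "white") +
  labs(x = "limit", y = "proportion")
p
```

Mapping `ymax = density` instead of `ymax = prop` gives the density version, in which the narrow bins near 0 and 1 become much taller than the rest.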