
Find binwidth of Histogram plotted using ggplot2


I have plotted a simple histogram using ggplot2 without setting the binwidth argument. However, I want to see which binwidth value was used for my histogram.

set.seed(1234)
df <- data.frame(
  sex=factor(rep(c("F", "M"), each=200)),
  weight=round(c(rnorm(200, mean=55, sd=5), rnorm(200, mean=65, sd=5)))
  )
head(df)

library(ggplot2)

ggplot(df, aes(x=weight)) + geom_histogram()

How can I view this value?

Thanks


Solution

  • Here are two ways.

    1. With ggplot_build, create a list object. Its data member is a list whose first element is a data.frame with columns xmin and xmax, one row per bin. The difference between these two columns is the binwidth;
    2. With layer_data, the process above is more direct. It extracts that data.frame, and the rest is the same.

    The return value of unique may have more than one element because of floating-point precision issues: the differences agree to the printed digits but are not bit-identical. Any one of the values can be used.
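
    As a minimal base R sketch of that floating-point effect: the vector widths below is made up for illustration (it is not taken from ggplot2), and rounding with signif is just one possible way to collapse near-identical values, assuming 7 significant digits is enough precision for a bin width.

    ```r
    # Hypothetical widths that differ only by tiny floating-point noise,
    # mimicking what xmax - xmin can look like.
    widths <- 39 / 29 + c(0, 1e-12, -1e-12)

    # unique() compares doubles exactly, so the noise yields several "distinct" values.
    n_distinct <- length(unique(widths))

    # Rounding to 7 significant digits before unique() collapses them to one value.
    collapsed <- unique(signif(widths, 7))

    # Alternatively, check that all widths agree within all.equal's default tolerance.
    same_width <- isTRUE(all.equal(min(widths), max(widths)))
    ```

    Taking the first element, as in the code that follows, is simpler and works just as well when the values differ only by noise.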

    set.seed(1234)
    df <- data.frame(
      sex=factor(rep(c("F", "M"), each=200)),
      weight=round(c(rnorm(200, mean=55, sd=5), rnorm(200, mean=65, sd=5)))
    )
    
    library(ggplot2)
    
    gg <- ggplot(df, aes(x=weight)) + geom_histogram()
    
    gg_build <- ggplot_build(gg)
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    bin_width <- gg_build$data[[1]]$xmax - gg_build$data[[1]]$xmin
    unique(bin_width)
    #> [1] 1.344828 1.344828 1.344828
    
    # For comparison: the data range divided by the number of bins gives a different value
    diff(range(df$weight))/30
    #> [1] 1.3
    
    gg_data <- layer_data(gg)
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    unique(gg_data$xmax - gg_data$xmin)
    #> [1] 1.344828 1.344828 1.344828
    
    bw <- unique(gg_data$xmax - gg_data$xmin)[1]
    bw
    #> [1] 1.344828
    

    Created on 2022-02-15 by the reprex package (v2.0.1)
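
    A note on why the two numbers in the output differ. The sketch below uses only values that can be read off the output above: a data range of 39 (since 39/30 = 1.3) and bins = 30. The extracted width happens to equal the range divided by bins - 1. That is an observation about this particular output, not a documented guarantee, and other ggplot2 versions or other settings (binwidth, boundary, center) may behave differently.

    ```r
    # Values read off the output above (assumptions for this illustration).
    x_range <- 39   # diff(range(df$weight)), because 39 / 30 = 1.3
    bins    <- 30   # default reported by stat_bin()

    naive_width    <- x_range / bins        # 1.3
    observed_width <- x_range / (bins - 1)  # 1.344828 when printed

    naive_width
    observed_width
    ```

    To control the width directly instead of inferring it, pass binwidth to geom_histogram.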