
Setting the "binwidth" in ggplot histogram with POSIXct / datetime values (geom_histogram)


I'm trying to understand the behaviour of binwidth in geom_histogram() when it is applied to POSIXct / datetime values. The documentation says that binwidth specifies the width of the bins, that it can be given as a numeric value, and that the bin width of a date variable is the number of days in each time. So I would expect the following two ggplot commands to produce the same output.

Not only is this not the case, but the second command also takes about 5 minutes to run:

library(ggplot2)

df <- data.frame(day = as.POSIXct("2018-11-01 10:00:00")+(1:10)*3600*24)


ggplot(df,aes(day)) + 
  geom_histogram(bins = 10,colour = "black",fill = "grey")

ggplot(df,aes(day)) + 
  geom_histogram(binwidth = 1,colour = "black",fill = "grey")

Created on 2018-11-04 by the reprex package (v0.2.0).


Solution

  • I've had the rubber duck experience and found that by "date" the documentation specifically means a vector of class Date. The behaviour of binwidth with class POSIXct is described in the text that immediately follows: the bin width of a time variable is the number of seconds.

    In short, the solution is multiplying binwidth by 3600*24 to get days instead of seconds.

    ggplot(df,aes(day)) + 
      geom_histogram(binwidth = 1*3600*24,colour = "black",fill = "grey")
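To make the difference between seconds and days concrete, here is a minimal base-R sketch (no ggplot2 required). It rebuilds the same ten timestamps and inspects their underlying numeric values. The `tz = "UTC"` argument and the variable names are my own additions for reproducibility, not part of the original question. The last value also suggests why the `binwidth = 1` call was so slow: it asks for one bin per second across a nine-day range.

```r
# Same ten timestamps as in the question; tz = "UTC" is an assumption
# added here so the result does not depend on the local time zone.
times <- as.POSIXct("2018-11-01 10:00:00", tz = "UTC") + (1:10) * 3600 * 24

# POSIXct is stored as seconds since the epoch, so consecutive days
# are 86400 units apart.
step_posixct <- diff(as.numeric(times))

# Date is stored as days since the epoch, so consecutive days are
# exactly 1 unit apart.
step_date <- diff(as.numeric(as.Date(times)))

# Number of one-second-wide intervals needed to span the data,
# i.e. roughly how many bins binwidth = 1 requests on the POSIXct column.
one_second_bins <- diff(range(as.numeric(times)))

print(step_posixct[1])
print(step_date[1])
print(one_second_bins)
```

Under the same assumptions, converting the column with as.Date() before plotting should let binwidth = 1 mean one day, as an alternative to multiplying by 3600*24.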