
Histogram with curve in R


I need a histogram for my data, but could not find one with a curve. Can anyone please suggest a histogram showing frequencies (not densities) with a curve for the data below? Fancy ones are preferred, but no worries if not :)

x <- rnorm(1000)
hist(x)

Solution

  • Here's the slow, step-by-step version.

    This is your data.

    population_mean <- 0
    population_sd <- 1
    n <- 1000
    x <- rnorm(n, population_mean, population_sd)
    

    These are some x coordinates for drawing a curve. Notice the use of qnorm to get lower and upper quantiles from a normal distribution.

    population_x <- seq(
      qnorm(0.001, population_mean, population_sd), 
      qnorm(0.999, population_mean, population_sd), 
      length.out = 1000
    )
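
    As a quick check on why those two quantiles make sensible end points, here is a small sketch for the standard normal case. The names tail_quantiles and coverage are made up for illustration. It shows that the end points sit about 3.09 standard deviations either side of the mean and that 99.8% of the distribution lies between them.

    ```r
    # Quantiles cutting off 0.1% in each tail of a standard normal.
    tail_quantiles <- qnorm(c(0.001, 0.999), mean = 0, sd = 1)
    print(round(tail_quantiles, 2))  # -3.09  3.09

    # Probability mass between the two end points.
    coverage <- diff(pnorm(tail_quantiles, mean = 0, sd = 1))
    print(coverage)  # 0.998
    ```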
    

    In order to convert from density to counts, we need to know the binwidth. This is easiest if we specify it ourselves.

    binwidth <- 0.5
    breaks <- seq(floor(min(x)), ceiling(max(x)), binwidth)
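
    To see the relationship between density and counts concretely, this small sketch asks hist for its numbers without drawing anything. It assumes equal-width bins. The fixed seed and the name h are additions for illustration only.

    ```r
    set.seed(42)  # assumption: fixed seed so the check is reproducible
    n <- 1000
    x <- rnorm(n)

    binwidth <- 0.5
    breaks <- seq(floor(min(x)), ceiling(max(x)), binwidth)

    # plot = FALSE returns the histogram object without drawing it.
    h <- hist(x, breaks = breaks, plot = FALSE)

    # With equal-width bins, count = density * n * binwidth in every bin.
    print(all.equal(as.numeric(h$counts), h$density * n * binwidth))
    ```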
    

    Here's our histogram.

    hist(x, breaks)
    

    The count curve is the normal density times the number of data points times the binwidth.

    lines(
      population_x, 
      n * dnorm(population_x, population_mean, population_sd) * binwidth, 
      col = "red"
    )
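
    This scaling can be checked against the exact expected count per bin, which is n times the probability that a normal draw lands in that bin. The sketch below is only an illustration: the break range of -4 to 4 and the names mids, exact_expected and approx_expected are assumptions, not part of the steps above.

    ```r
    n <- 1000
    binwidth <- 0.5
    breaks <- seq(-4, 4, binwidth)           # assumed range for illustration
    mids <- head(breaks, -1) + binwidth / 2  # bin midpoints

    # Exact expected count: n * P(lower < X <= upper) for each bin.
    exact_expected <- n * diff(pnorm(breaks))

    # Approximation used for the curve: n * density at the midpoint * binwidth.
    approx_expected <- n * dnorm(mids) * binwidth

    # Largest disagreement between the two, in counts per bin.
    print(max(abs(exact_expected - approx_expected)))
    ```

    With a binwidth of 0.5 the two differ by only a couple of counts in the tallest bins, and the gap shrinks as the bins get narrower.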
    

    Let's see that again with the sample distribution rather than the population distribution.

    sample_mean <- mean(x)
    sample_sd <- sd(x)
    sample_x <- seq(
      qnorm(0.001, sample_mean, sample_sd), 
      qnorm(0.999, sample_mean, sample_sd), 
      length.out = 1000
    )
    lines(
      sample_x, 
      n * dnorm(sample_x, sample_mean, sample_sd) * binwidth, 
      col = "blue"
    )
    

    [Figure: histogram with frequency curves]
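
    If you want to reuse this, the steps can be folded into one function. This is a minimal sketch rather than a polished implementation: the function name hist_with_normal_curve and its arguments are hypothetical, it fits the curve from the sample mean and standard deviation, and it assumes a binwidth that divides 1 evenly (such as 0.5 or 0.25) so that the breaks cover the whole data range.

    ```r
    # Hypothetical helper: frequency histogram plus a normal count curve
    # fitted from the sample mean and standard deviation.
    hist_with_normal_curve <- function(x, binwidth = 0.5, col = "blue", ...) {
      n <- length(x)
      m <- mean(x)
      s <- sd(x)

      # Assumption: binwidth divides 1 evenly, so the breaks span the data.
      breaks <- seq(floor(min(x)), ceiling(max(x)), binwidth)
      h <- hist(x, breaks = breaks, ...)

      # Density scaled to counts: n * density * binwidth.
      curve_x <- seq(min(breaks), max(breaks), length.out = 1000)
      curve_y <- n * dnorm(curve_x, m, s) * binwidth
      lines(curve_x, curve_y, col = col)

      # Return the pieces invisibly in case the caller wants to inspect them.
      invisible(list(hist = h, x = curve_x, y = curve_y))
    }

    set.seed(1)  # assumption: fixed seed for a reproducible example
    result <- hist_with_normal_curve(rnorm(1000), binwidth = 0.5)
    ```

    Extra arguments such as main or xlab are passed through to hist.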