
How to generate a cumulative relative frequency histogram with a polygon in base R


I am trying to generate a cumulative relative frequency histogram with a polygon in base R that looks like this (image: cumulative relative frequency with polygon).

These are my data (they are different from those of the picture above):

df <- data.frame(foo = c(15, 23, 12, 10, 28, 20, 12, 17, 20, 
                         21, 18, 13, 11, 12, 26, 30,  6, 16,
                         19, 22, 14, 17, 21, 28,  9, 16, 13,
                         11, 16))

I can draw a basic histogram using

hist(df$foo, col="pink")

And I can draw a cumulative absolute frequency histogram with a polygon like this:

# save to object
h <- hist(df$foo, plot=FALSE)

# replace with cumulative data
h$counts <- cumsum(h$counts)

### plot
plot(h, col="hotpink")
# add Polygon
lines(c(0, h$mids), c(0,h$counts), col="blue") # add type="s" if you like

But how would I do that with relative and cumulative frequencies, just as the screenshot above shows?

(the screenshot is from https://github.com/asalber/statistics-manual p.20)


Solution

  • To get cumulative relative frequencies, divide the cumulative counts by the total count. Then use the bin breaks, rather than the midpoints, as the x values of the polygon.

    df <- data.frame(foo = c(15, 23, 12, 10, 28, 20, 12, 17, 20, 
                             21, 18, 13, 11, 12, 26, 30,  6, 16,
                             19, 22, 14, 17, 21, 28,  9, 16, 13,
                             11, 16))
    
    h <- hist(df$foo, plot=FALSE)
    # cumulative relative frequency
    cumsum(h$counts)/sum(h$counts)
    #> [1] 0.1034483 0.4137931 0.7241379 0.8620690 1.0000000
    h$counts <- cumsum(h$counts)/sum(h$counts)
    ### plot
    plot(h, col="hotpink", ylab = "Cumulative relative frequency")
    # add Polygon
    lines(h$breaks, c(0, h$counts), col = "blue") # add type="s" if you like
    

    Created on 2024-06-08 with reprex v2.1.0
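
The steps in this answer can be wrapped in a small function for reuse. The following is a minimal sketch, not part of the original answer: the name `cum_rel_hist` and its arguments (`breaks`, `draw`, `col`, `line_col`, `xlab`) are hypothetical, and it assumes equal-width bins such as those `hist()` produces by default. It also cross-checks the result against `ecdf()`, which should agree at the upper bin edges because `hist()` uses right-closed bins by default.

```r
df <- data.frame(foo = c(15, 23, 12, 10, 28, 20, 12, 17, 20,
                         21, 18, 13, 11, 12, 26, 30,  6, 16,
                         19, 22, 14, 17, 21, 28,  9, 16, 13,
                         11, 16))

# Hypothetical helper (illustrative name and arguments, not from the answer).
cum_rel_hist <- function(x, breaks = "Sturges", draw = TRUE,
                         col = "hotpink", line_col = "blue", xlab = "x") {
  h <- hist(x, breaks = breaks, plot = FALSE)
  # cumulative counts divided by the sample size
  cum_rel <- cumsum(h$counts) / sum(h$counts)
  if (draw) {
    h$counts <- cum_rel  # plot() draws h$counts when bins have equal width
    plot(h, col = col, xlab = xlab, main = "",
         ylab = "Cumulative relative frequency")
    # polygon: starts at 0 on the first break and reaches 1 on the last
    lines(h$breaks, c(0, cum_rel), col = line_col)
  }
  # return the upper bin edges and their cumulative relative frequencies
  invisible(data.frame(upper = h$breaks[-1], cum_rel = cum_rel))
}

res <- cum_rel_hist(df$foo, xlab = "foo")

# Cross-check: with right-closed bins, the cumulative relative frequency
# at each upper bin edge equals the empirical CDF at that edge.
stopifnot(isTRUE(all.equal(res$cum_rel, ecdf(df$foo)(res$upper))))
```

Calling `cum_rel_hist(df$foo, draw = FALSE)` returns the same table without drawing anything, which is convenient for checking the values before plotting.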