I understand how to get the density values from this data: for example, the density 0.69 is obtained from counts/bin width = 3448/(0.5*10000) = 0.6896, right?
set.seed(1234)
h <- hist(rbinom(10000, 10, 0.1), freq=FALSE)
str(h)
#List of 6
# $ breaks : num [1:11] 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 ...
# $ counts : int [1:10] 3448 3930 0 1910 0 588 0 112 0 12
# $ density : num [1:10] 0.69 0.786 0 0.382 0 ...
# $ mids : num [1:10] 0.25 0.75 1.25 1.75 2.25 2.75 3.25 3.75 4.25 4.75
# $ xname : chr "rbinom(10000, 10, 0.1)"
# $ equidist: logi TRUE
# - attr(*, "class")= chr "histogram"
However, using the Temp column of R's built-in airquality dataset, I got:
Temperature <- airquality$Temp
h = hist(Temperature)
str(h)
#List of 6
# $ breaks : int [1:10] 55 60 65 70 75 80 85 90 95 100
# $ counts : int [1:9] 8 10 15 19 33 34 20 12 2
# $ density : num [1:9] 0.0105 0.0131 0.0196 0.0248 0.0431 ...
# $ mids : num [1:9] 57.5 62.5 67.5 72.5 77.5 82.5 87.5 92.5 97.5
# $ xname : chr "Temperature"
# $ equidist: logi TRUE
# - attr(*, "class")= chr "histogram"
Doing the same calculation as before, counts/class width gives, for example, 8/5 = 1.6 rather than 0.0105. My question is: how are the density values (0.0105 0.0131 0.0196 0.0248 0.0431 ...) in this histogram calculated?
You need to divide the counts by both the total number of observations and the bin width:
h$counts / nrow(airquality) / 5
#> [1] 0.010457516 0.013071895 0.019607843 0.024836601 0.043137255 0.044444444
#> [7] 0.026143791 0.015686275 0.002614379
We can see this matches density:
h$density
#> [1] 0.010457516 0.013071895 0.019607843 0.024836601 0.043137255 0.044444444
#> [7] 0.026143791 0.015686275 0.002614379
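The same calculation can be written without hard-coding the sample size or the bin width. This is only a minimal sketch using base R; the names `widths`, `n` and `manual_density` are illustrative and not part of the histogram object:

```r
Temperature <- airquality$Temp
h <- hist(Temperature, plot = FALSE)  # plot = FALSE: compute the bins without drawing

widths <- diff(h$breaks)  # width of each bin, taken from the break points
n <- sum(h$counts)        # total number of observations that were binned

# density = count / (n * bin width), computed per bin
manual_density <- h$counts / (n * widths)

# Compare with the density stored in the histogram object
all.equal(manual_density, h$density)
```

Using `diff(h$breaks)` rather than a fixed `5` means the same line should also work when the bins are not all the same width.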
The calculation is the same for your initial example:
3448 / 10000 / 0.5
#> [1] 0.6896
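Dividing by the number of observations as well as the bin width is what makes the bars of a density histogram have a total area of 1. A quick way to check this, as a sketch assuming the same simulated data as in the question (`total_area` is just an illustrative name):

```r
set.seed(1234)
h <- hist(rbinom(10000, 10, 0.1), plot = FALSE)  # compute bins only, no plot

# Area of each bar is density * bin width; summed over all bars
total_area <- sum(h$density * diff(h$breaks))

# Should equal 1 up to floating-point error
all.equal(total_area, 1)
```

If you divide by the bin width alone, the areas add up to the number of observations instead, which is why 8/5 gives 1.6 rather than 0.0105.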