r, probability-density, auc

Convert a vector to density vector in R


I have a vector

v = [..., -10, -10, -10, ..., 1, 2, 5, 6, 7, 9, ...]

The geom_density plots the histogram of this vector in a smooth fashion, like a density function!

How can I use the auc (area under the curve) function from the MESS library to compute the area under the density curve of such a vector over a given interval, let's say (-1, 3)?


Solution

  • "The geom_density plots the histogram of this vector in a smooth fashion, like a density function!" Well, that's because geom_density performs a kernel density estimation! So it's not "like a density function", it is a density function.

    Under the hood, geom_density relies on stats::density to perform the density estimation. The resulting kernel density estimate is a proper probability density function, so the area under the whole curve is 1.

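    We can confirm that numerically by summing the density values over the evaluation grid and multiplying by the grid spacing: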

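    x <- rnorm(100)                            # sample data
    dens <- density(x)                         # kernel density estimate on an equally spaced grid
    df <- data.frame(x = dens$x, y = dens$y)   # grid points and density values
    sum(df$y) * diff(df$x)[1]                  # Riemann sum: heights times grid spacing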
    #[1] 1.000952
    

    Close enough.

    It's straightforward to integrate the density function over a specific range: sum the values of df$y whose df$x falls inside the range and multiply by the grid spacing. Since you don't provide sample data, I leave that up to you.
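
For illustration, here is a minimal sketch of that integration over the interval (-1, 3). It assumes simulated normal data in place of your vector v, and the names `lower`, `upper`, `step`, `in_range`, `area_riemann` and `area_auc` are made up for this example. The last step calls `MESS::auc` with its `from` and `to` arguments, which by default interpolates linearly between the grid points, so the two results should be close but not identical.

```r
set.seed(1)                       # arbitrary seed, only so the sketch is reproducible
x <- rnorm(100)                   # stand-in data (assumption); replace with your vector v
dens <- density(x)                # kernel density estimate on an equally spaced grid
df <- data.frame(x = dens$x, y = dens$y)

lower <- -1                       # interval of interest from the question
upper <- 3

# Riemann sum: density heights inside the interval times the grid spacing
step <- diff(df$x)[1]
in_range <- df$x >= lower & df$x <= upper
area_riemann <- sum(df$y[in_range]) * step
print(area_riemann)

# Same area via MESS::auc, if the package is installed
if (requireNamespace("MESS", quietly = TRUE)) {
  area_auc <- MESS::auc(df$x, df$y, from = lower, to = upper)
  print(area_auc)
}
```

The result estimates the probability that a value drawn from the estimated density falls between -1 and 3. A finer grid, for example `density(x, n = 2048)`, should reduce the discretization error of the Riemann sum.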