I have a dataframe (gdata) with x (as "r") and y (as "km") coordinates of a function. When I plot it like this:
plot(x = gdata$r, y = gdata$km, type = "l")
I get the graph of the function:
Now I want to calculate the area under the curve from x = 0 to x = 0.6. When I look for appropriate packages I only find something like calculation AUC of a ROC curve. But is there a way just to calculate the AUC of a normal function?
The area under the curve (AUC) of a given set of data points can be archived using numeric integration:
Let data
be your data frame containing x and y values. You can get the area under the curve from lower x0=0 to upper x1=0.6 by integrating the function, which is linearly approximating your data.
This is a numeric approximation and not exact, because we do not have an infinite number of data points: For y=sqrt(x)
we will get 0.3033 instead of true value of 0.3098. For 200 rows in data
we'll get even better with auc=0.3096.
library(tidyverse)
data <-
tibble(
x = seq(0, 2, length.out = 20)
) %>%
mutate(y = sqrt(x))
data
#> # A tibble: 20 × 2
#> x y
#> <dbl> <dbl>
#> 1 0 0
#> 2 0.105 0.324
#> 3 0.211 0.459
#> 4 0.316 0.562
#> 5 0.421 0.649
#> 6 0.526 0.725
#> 7 0.632 0.795
#> 8 0.737 0.858
#> 9 0.842 0.918
#> 10 0.947 0.973
#> 11 1.05 1.03
#> 12 1.16 1.08
#> 13 1.26 1.12
#> 14 1.37 1.17
#> 15 1.47 1.21
#> 16 1.58 1.26
#> 17 1.68 1.30
#> 18 1.79 1.34
#> 19 1.89 1.38
#> 20 2 1.41
qplot(x, y, data = data)
integrate(approxfun(data$x, data$y), 0, 0.6)
#> 0.3033307 with absolute error < 8.8e-05
Created on 2021-10-03 by the reprex package (v2.0.1)
The absolute error returned by integrate
is corerect, iff the real world between every two data points is a perfect linear interpolation, as we assumed.