
R: Taking an average of a variable over intervals of another numeric variable


How would I calculate the average of column 2 over each fixed-width interval of column 1, when the number of rows in each interval is not always equal?

It seems very simple but I'm not sure where to start.

df <- data.frame(dist = c(0.06,0.22,0.38,0.44,0.5,0.52,0.6,0.74,0.76,0.88,0.92,0.94,1,1.18,1.26,1.3,1.4,1.48,1.5), 
            value = c(12,54.6,46.6,59.7,65.4,66.4,67,76.5,77.3,94.5,95.5,95,93.7,106.5,112.3,112.4,112.6,114.3,114.2))

Let's say I want the block average of column 2 (value) where column 1 (dist) runs from 0 to 0.5, then 0.5 to 1, then 1 to 1.5, and so on. If 0 to 0.5 covers 5 rows and 0.5 to 1 covers 9 rows, what is the best way to do this without having to specify the row numbers?

I have tried searching, but perhaps I'm not using the right keywords.


Solution

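  • Using aggregate in base R, with cut to bin dist into intervals of width 0.5 (each interval is open on the left and closed on the right, so dist = 0.5 falls in the first one)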

     aggregate(value ~ grp, transform(df, grp = cut(dist, seq(0, 1.5, .5))), mean)
    

    Output:

          grp    value
    1 (0,0.5]  47.6600
    2 (0.5,1]  83.2375
    3 (1,1.5] 112.0500
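
If the upper limit of dist is not known in advance, the breaks can be derived from the data instead of being typed in. The following is only a sketch of that idea: the names width, breaks, means and counts are illustrative, the interval width of 0.5 is taken from the question, and include.lowest = TRUE is an added assumption so that a dist of exactly 0 would not be dropped.

```r
df <- data.frame(
  dist  = c(0.06, 0.22, 0.38, 0.44, 0.5, 0.52, 0.6, 0.74, 0.76, 0.88,
            0.92, 0.94, 1, 1.18, 1.26, 1.3, 1.4, 1.48, 1.5),
  value = c(12, 54.6, 46.6, 59.7, 65.4, 66.4, 67, 76.5, 77.3, 94.5,
            95.5, 95, 93.7, 106.5, 112.3, 112.4, 112.6, 114.3, 114.2)
)

width <- 0.5  # assumed interval width, as in the question

# Last break: the smallest multiple of width that still covers max(dist)
breaks <- seq(0, ceiling(max(df$dist) / width) * width, by = width)

# Assign every row to an interval; include.lowest = TRUE closes the
# first interval on the left so that dist == 0 would be kept
df$grp <- cut(df$dist, breaks = breaks, include.lowest = TRUE)

# Mean of value and number of rows in each interval
means  <- tapply(df$value, df$grp, mean)
counts <- table(df$grp)

result <- data.frame(grp   = names(means),
                     mean  = as.vector(means),
                     n     = as.vector(counts))
print(result)
```

For the sample data this gives the same three means as the aggregate call above, with 5, 8 and 6 rows in the three intervals. The only visible difference is that the first interval is labelled [0,0.5] rather than (0,0.5], because of include.lowest.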