I'm new to R and want to use it to work with my data directly. My ultimate goal is to make a histogram / bar plot.
Depth: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Percent: .4, .1, .5, .2, .1, .3, .9, .3, .2, .2, .8
I want to take the Depth vector and bin it into unequal chunks (0, 1-5, 6-8, 9-10), and take the Percent values and somehow sum them together for the matching chunks.
For example:
0 -> .4
1-5 -> 1.2
6-8 -> 1.4
9-10 -> 1.0
The actual data set goes into the thousands, and I feel R might be better suited for this than using C++ to group my data into a smaller table before letting R plot it.
I looked up how to use split and cut, but I'm not quite sure how to use the data after I cut it into ranges. If I pass "breaks" to cut, I don't know how to include the initial zero value (corresponding to .4 in the example).
Any suggestions or approaches would be appreciated.
You're on the right track with cut:
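dat <- data.frame(Depth = 0:10,
                  Percent = c(0.4, 0.1, 0.5, 0.2, 0.1, 0.3, 0.9, 0.3, 0.2, 0.2, 0.8))
# right = FALSE makes each interval closed on the left, [a, b), so 0 falls in [0, 1);
# the last break is 11 so that Depth 10 lands in [9, 11)
cuts <- cut(dat$Depth, breaks = c(0, 1, 6, 9, 11), right = FALSE)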
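If the default interval names such as [1,6) are hard to read, cut also accepts a labels argument. The following is only a sketch: the label strings are my own choice to match the ranges in the question, and table() is used just to check that each Depth value landed in the intended bin.

```r
dat <- data.frame(Depth = 0:10,
                  Percent = c(0.4, 0.1, 0.5, 0.2, 0.1, 0.3, 0.9, 0.3, 0.2, 0.2, 0.8))

# Same breaks as above; the labels are illustrative names for the four bins
cuts <- cut(dat$Depth,
            breaks = c(0, 1, 6, 9, 11),
            right  = FALSE,
            labels = c("0", "1-5", "6-8", "9-10"))

# Count how many Depth values fell into each bin as a sanity check
bin_counts <- table(cuts)
print(bin_counts)
```

With integer depths this gives one bin holding only 0, which is the case the question was unsure about.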
Then you can use aggregate:
aggregate(dat$Percent, list(cuts), sum)
Or as a one-liner:
aggregate(dat$Percent,
          list(cut(dat$Depth,
                   breaks = c(0, 1, 6, 9, 11),
                   right = FALSE)),
          sum)
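Since the end goal is a bar plot, here is one way the summed values could be passed on to barplot. Treat it as a sketch rather than the only approach: the Bin column name, the bin labels, and the axis titles are assumptions of mine, and it uses the formula interface of aggregate so the result has named columns instead of the default Group.1 and x.

```r
dat <- data.frame(Depth = 0:10,
                  Percent = c(0.4, 0.1, 0.5, 0.2, 0.1, 0.3, 0.9, 0.3, 0.2, 0.2, 0.8))

# Store the bin as a column (the name "Bin" is hypothetical) so the formula can use it
dat$Bin <- cut(dat$Depth,
               breaks = c(0, 1, 6, 9, 11),
               right  = FALSE,
               labels = c("0", "1-5", "6-8", "9-10"))

# Sum Percent within each bin; the result has columns Bin and Percent
sums <- aggregate(Percent ~ Bin, data = dat, FUN = sum)

# One bar per bin, labelled with the bin name
barplot(sums$Percent,
        names.arg = as.character(sums$Bin),
        xlab = "Depth",
        ylab = "Summed Percent")

# tapply is an alternative that returns a named vector of the same sums
sums_vec <- tapply(dat$Percent, dat$Bin, sum)
```

The named vector from tapply can also be given to barplot directly, in which case the names are used as bar labels.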