This should be a pretty straightforward question, but I can't find the answer anywhere (in part because I'm not sure what to search for).
In R, it's easy to compute the density of:
c(1, 2, 2, 2, 3, 5, 5, 7, 8, 10, 10, 10)
You just do:
density(c(1, 2, 2, 2, 3, 5, 5, 7, 8, 10, 10, 10))
The problem is, if I had an "ungrouped" vector like this for my data, it would be far too large for R (or the query engine that builds the dataset) to handle. So I need to use a GROUP BY and COUNT(*) in the initial query to compress my results (as such, using rep() to expand the counts doesn't help). Given such a data frame of 'counts', how do I then compute the density (for a KDE plot) of a frame like:
Value Count
1 1
2 3
3 1
5 2
7 1
8 1
10 3
And just to be clear, I really do need a density plot, not a histogram.
Just use the weights argument, normalizing the counts so that they sum to 1:
density(d$Value, weights=d$Count/sum(d$Count))
(edited to account for first comment)
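One caveat worth checking: as far as I know, the automatic bandwidth rules in density() look only at the values passed in and ignore the weights, so with grouped data the default bandwidth is based on the distinct values alone. A minimal sketch of a sanity check on the small example from the question follows. The names d, w, bw, dens_grouped and dens_full are just illustrative, and the bandwidth is taken from the expanded vector only because this sample is tiny; for the real data you would have to pick bw some other way.

```r
# Grouped data from the question: one row per distinct value, with its count.
d <- data.frame(Value = c(1, 2, 3, 5, 7, 8, 10),
                Count = c(1, 3, 1, 2, 1, 1, 3))

# Normalize the counts so the weights sum to 1.
w <- d$Count / sum(d$Count)

# For this check only: expand the counts (feasible here because the sample is
# tiny) and take the default bandwidth of the ungrouped vector.
full <- rep(d$Value, d$Count)
bw <- bw.nrd0(full)

# Weighted KDE on the grouped data, with the bandwidth given explicitly.
dens_grouped <- density(d$Value, weights = w, bw = bw)

# Reference: unweighted KDE on the expanded data with the same bandwidth.
dens_full <- density(full, bw = bw)

# TRUE if the two estimates agree up to floating-point tolerance.
all.equal(dens_grouped$y, dens_full$y)
```

With the same bandwidth, the grouped and ungrouped estimates should coincide, so plot(dens_grouped) gives the KDE plot without ever building the full vector.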