r mongodb statistics rjson

Calculate sum of counts per min from data frame in R


I've been trying to figure this out for a while, but haven't been able to do so. I found a lot of similar questions which didn't help at all.

I have around 43,000 records in a data frame in R. The date column is in the format "2011-11-15 02:00:01", and the other column is the count. The structure of the data frame:

str(results)
'data.frame':   43070 obs. of  2 variables:
 $ dates: Factor w/ 43070 levels "2011-11-15 02:00:01",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ count: num  1 2 1 1 1 1 2 3 1 2 ...

How can I get the total count per min?

I also want to convert the results data frame into JSON. I used the rjson package, which converted the entire data frame into a single JSON element. When I inserted it into MongoDB, there was only one _id for all 43,000 records. What did I do wrong?


Solution

  • You can use the xts package to get the counts/minute quite easily.

    install.packages("xts")  # only needed once
    library(xts)
    # Index the counts by timestamp; the factor of dates is parsed as date-times
    results_xts <- xts(results$count, order.by = as.POSIXlt(results$dates))
    

    This converts your data frame to an xts object. xts has a family of functions (apply.daily, apply.yearly, etc.) that apply a function over different time frames, but there isn't one for minutes. Fortunately, the code behind those functions is very simple, so you can do the same thing directly:

    ep <- endpoints(results_xts, "minutes")
    period.apply(results_xts, ep, FUN = sum)
    

    Sorry, I don't know the answer to your other question.
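
As a way to sanity-check the per-minute sums, here is a minimal base R sketch that needs no extra packages. It is an alternative to the xts approach, not a replacement for it. The toy `results` data frame below is made up for illustration, and the `UTC` time zone is an assumption; use whatever zone your timestamps were recorded in.

```r
# Toy data in the same shape as `results` (values are invented for illustration)
results <- data.frame(
  dates = factor(c("2011-11-15 02:00:01", "2011-11-15 02:00:30",
                   "2011-11-15 02:01:05", "2011-11-15 02:01:59",
                   "2011-11-15 02:03:10")),
  count = c(1, 2, 1, 3, 2)
)

# Parse the factor as date-times; UTC is an assumption, not taken from the question
times <- as.POSIXct(as.character(results$dates), tz = "UTC")

# Label each row with its minute by dropping the seconds
results$minute <- format(times, "%Y-%m-%d %H:%M")

# Sum the counts within each minute
per_minute <- aggregate(count ~ minute, data = results, FUN = sum)
print(per_minute)
```

Minutes with no records (02:02 in the toy data) simply do not appear in the result, so if you need a zero for every minute you would have to fill those in separately.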