
Creating Histogram with Binning Averages

I am making two histograms using moving averages and binning. Using Excel, I have computed a moving average of 18k data points, most of which are 0.

This is what I am looking to reproduce in R:

"Moving Average"

I would like to use R to write a script that produces a histogram of how many ‘counts’ a device received. I have tried:

hist(y, 20)  
hist (y, ) 
plot (y, x) 

and now, after three days of learning, this is what I have:

y <- AltWithAllCounts$
x <- AltWithAllCounts$Alt 
barwidth <- 100
# number of data points per bin
block <- rep(seq(1,length(x)/barwidth),each=barwidth)
# assigns each data point to a bin
a <- aggregate(y,by=list(block),sum)
# sums the counts in each bin
altmean <- aggregate(x,by=list(block),mean)
# finds the mean altitude of each bin
avgCount <- a$x/barwidth
# averages the counts in each bin
# collects the binned values for a scatterplot
avgBinCnt <- data.frame(altmean$x,a$x)

The idea is that I want to take the average of every 20 values and plot it over time, which is x:

x       y
851304  0
851404  0
851503  0
851603  1
851703  0
851804  0
851904  0
852107  0
852203  0
852303  0
922503  0
922603  2
922703  0
922804  0
922904  0
923107  0
923203  0
923303  0
923404  0
923504  0
923604  0
923703  0
923803  0
923904  0
924108  0
924205  1
1441603 0
1441703 0
1441804 0
1441904 0
1442107 1
1442203 1
1442304 0
1442404 4
1442504 0
1442605 1
1442703 6
1442803 8
1442904 0 


  • A histogram shows how often each value occurs, not how many events fall in each interval. To get the latter, one can do something like this:

    # First create some test data
    t <- seq(1,20000)
    p <- 2000
    s <- (sin(t*pi/p)+1)/2
    d <- ifelse(runif(length(s))<s,1,0)
    # Each element of d now contains a 1 or a 0, with a probability that varies
    # according to the sine function
    # Choose how many elements to count over
    barwidth <- 100
    # Create a vector of block numbers, with each numbered block having a length of 
    # barwidth
    block <- rep(seq(1,length(s)/barwidth),each=barwidth)
    # Now we aggregate with the sum to find the number of 1s in each block
    a <- aggregate(d,by=list(block),sum)
    # And plot it to show that we have the expected result
    barplot(a$x)

    ... which gives:

    [image: bar chart of the count in each block]

    For a scatter plot of frequency rather than a bar chart of counts, this gives the desired output:

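    # Mean position of each block, used as the x coordinate
    midpoint <- aggregate(t,by=list(block),mean)
    # Count per block divided by block length gives the frequency
    plot(midpoint$x,a$x/barwidth)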

    Or a symmetrical running average can be found and plotted with:

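    # Equal weights summing to 1, so the filter returns a running mean
    filt <- rep(1/barwidth,barwidth)
    # sides=2 centres the window on each point; the ends come back as NA
    y_sym <- filter(d, filt, sides=2)
    plot(t,y_sym)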
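
The block vector above assumes the number of data points is an exact multiple of barwidth. As a minimal sketch of the same binning idea applied to data shaped like the question's x/y table, the following uses a bin of 20 readings and lets the last bin be shorter. The simulated data and the names binned, time, avgCount, and running are made up for illustration and are not from the original post.

```r
# Hypothetical stand-in for the asker's data: x is a timestamp and
# y is the count recorded at that time.
set.seed(1)
x <- seq(851304, by = 100, length.out = 250)
y <- rpois(length(x), lambda = 0.3)

barwidth <- 20  # number of consecutive readings per bin

# Block index that also works when length(x) is not a multiple of
# barwidth; the last block is simply shorter.
block <- ceiling(seq_along(x) / barwidth)

# Mean time and mean count of each block
binned <- data.frame(
  time     = as.vector(tapply(x, block, mean)),
  avgCount = as.vector(tapply(y, block, mean))
)

plot(binned$time, binned$avgCount, type = "b",
     xlab = "x (time)", ylab = "mean count per reading")

# Two-sided running mean over the same window length. stats:: is spelled
# out because packages such as dplyr mask filter(). Positions where the
# window does not fit are returned as NA.
running <- stats::filter(y, rep(1 / barwidth, barwidth), sides = 2)
lines(x, running, col = "red")
```

Using mean() per block rather than sum()/barwidth keeps the average correct for the shorter final block.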