Search code examples
rmeanvarianceperiodicity

Calculating the mean and variance of a periodic (circular) variable in R


I have several variables in my dataset that represent daily timing of events across a week.

For example for two rows might look like:

t1 = c(NA,12.6,10.7,11.5,12.5,9.5,14.1)
t2 = c(23.7,1.2,NA,22.9,23.2,0.5,0.1)

I want to calculate the variance of these rows. To do this, I need the mean and because these are periodic variables, I've adapted the code from this page:

#This can all be wrapped in a function like this
circ.mean <- function(m,int,na.rm=T) {

   if(na.rm) m <- m[!is.na(m)]
   rad.m    = m*(360/int)*(pi/180)
   mean.cos = mean(cos(rad.m))
   mean.sin = mean(sin(rad.m))
   x.deg    = atan(mean.sin/mean.cos)*(180/pi)

   return(x.deg/(360/int))
}

This works as expected for t2:

> circ.mean(t2,24)
[1] -0.06803088

although ideally the answer would be 23.93197. But for t1, it gives an incorrect answer:

> circ.mean(t1,24)
[1] -0.1810074

whereas using the normal mean function gives the right answer:

> mean(t1,na.rm=T)
[1] 11.81667

My questions are:

1) Is this "circular mean" code correct and if so, am I using it correctly?

2) I've had a stab my own circ.var function (see below) to calculate the variance of a periodic variable - will this produce the correct variances for all possible input timing vectors?

circ.var <- function(m,int=NULL,na.rm=TRUE) {

if(is.null(int)) stop("Period parameter missing")
if(na.rm) m <- m[!is.na(m)]
if(sum(!is.na(m))==0) return(NA)
n=length(m)
mean.m = circ.mean(m,int)
var.m = 1/(n-1)*sum((((m-mean.m+(int/2))%%int)-(int/2))^2)
return(var.m)

}

Any help would be hugely appreciated! Thanks for taking the time to read this!


Solution

  • I deleted my old answer, as I believe there was a mistake in the solution I provided.

    I've written a series of R scripts that I've made available at my GitHub page which should calculate the mean, variance and other stats.

    Thanks to @Gregor for his help.