r lag dplyr

A number divided by itself doesn't equal 1?


I'm not sure what I'm missing here:

library(dplyr)
df1<-data.frame(n=c(1,1,1,2,1,1,2))
mutate(df1,foo=n/mean(c(n,lag(n)),na.rm=TRUE))
  n    foo
1 1 0.8125
2 1 0.8125
3 1 0.8125
4 2 1.6250
5 1 0.8125
6 1 0.8125
7 2 1.6250

What on earth is going on? The first row should be, basically, 1/mean(1), i.e., '1'. Why am I getting 0.8125? What's even stranger is in my original dataset, I'm getting yet another number - 0.608, for basically the same calculation. What am I missing?


Solution

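  • Try summarise(df1, length(c(n,lag(n)))): inside mutate(), n is the whole column, so c(n,lag(n)) is a single vector twice as long as the number of rows (14 here). Its mean with na.rm=TRUE is one scalar, 16/13 = 1.230769, and every row is divided by that same value (1/1.230769 = 0.8125 and 2/1.230769 = 1.625).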

    What I believe you want to do is:

    mutate(df1,foo=n/rowMeans(cbind(n,lag(n)),na.rm=TRUE))
    
      n       foo
    1 1 1.0000000
    2 1 1.0000000
    3 1 1.0000000
    4 2 1.3333333
    5 1 0.6666667
    6 1 1.0000000
    7 2 1.3333333
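
To see the difference between the two calculations outside of mutate(), here is a minimal base-R sketch. It assumes that c(NA, head(n, -1)) is an adequate stand-in for dplyr's lag(n) with default arguments; the variable names (lag_n, pooled, foo_pooled, foo_rowwise) are made up for illustration.

```r
# Base R only; c(NA, head(n, -1)) stands in for dplyr's lag(n) (assumption).
n     <- c(1, 1, 1, 2, 1, 1, 2)
lag_n <- c(NA, head(n, -1))

# What the original call computed: c() stacks both columns into one
# vector, so mean() returns a single number for the whole data frame.
pooled      <- c(n, lag_n)
pooled_mean <- mean(pooled, na.rm = TRUE)   # 16 / 13
foo_pooled  <- n / pooled_mean              # same divisor on every row

# What was intended: cbind() keeps the columns side by side, and
# rowMeans() returns one mean per row (NA dropped in the first row).
row_mean    <- rowMeans(cbind(n, lag_n), na.rm = TRUE)
foo_rowwise <- n / row_mean

print(length(pooled))   # twice the number of rows
print(foo_pooled)
print(foo_rowwise)
```

The different figure seen in the original dataset (0.608) would be consistent with this: the pooled mean depends on every value in the column, so it changes with the data.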