I would like to calculate the mean and standard deviation of a column in a table, but cumulatively, so that both are recalculated with each new observation. For example:
library(tidyverse)
aa <- data.frame(aa = c(2, 3, 4, 5, 6, 7, 8)) %>%
  mutate(aa1 = cumsum(aa), li = 1:n()) %>%
  mutate(MeanAA = aa1 / li)
aa = c(2, 3, 4, 5, 6, 7, 8)
mean(aa[1:2])
mean(aa[1:3])
sd(aa[1:2])
sd(aa[1:3])
I could do this for the mean but not for the SD. I would like to see how the SD changes relative to the mean as the number of observations increases.
How about this:
aa <- c(2, 3, 4, 5, 6, 7, 8)
for (i in 2:length(aa)) {
  mn <- mean(aa[1:i])
  ss <- sd(aa[1:i])
  cat(sprintf("1-%i\tMean: %.2f\tSD: %.2f\n", i, mn, ss))
}
#> 1-2 Mean: 2.50 SD: 0.71
#> 1-3 Mean: 3.00 SD: 1.00
#> 1-4 Mean: 3.50 SD: 1.29
#> 1-5 Mean: 4.00 SD: 1.58
#> 1-6 Mean: 4.50 SD: 1.87
#> 1-7 Mean: 5.00 SD: 2.16
Created on 2022-06-01 by the reprex package (v2.0.1)
If you need the values in a data frame, you can compute them like so:
library(tidyverse)
tibble(aa = c(2, 3, 4, 5, 6, 7, 8)) %>%
  mutate(
    running_mean = sapply(seq(n()), function(i) mean(aa[seq(i)])),
    running_sd = sapply(seq(n()), function(i) sd(aa[seq(i)])),
  )
#> # A tibble: 7 x 3
#>      aa running_mean running_sd
#>   <dbl>        <dbl>      <dbl>
#> 1     2          2       NA
#> 2     3          2.5      0.707
#> 3     4          3        1
#> 4     5          3.5      1.29
#> 5     6          4        1.58
#> 6     7          4.5      1.87
#> 7     8          5        2.16
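Since the question already uses cumsum() for the mean, the same idea can probably be extended to the SD. The sketch below is not from the answers above: it relies on the identity that the sample variance of the first n values equals (sum(x^2) - n * mean^2) / (n - 1), and the function name running_stats is just a hypothetical helper for illustration. It avoids recomputing sd() on every prefix, but it can lose precision when the values are large relative to their spread, in which case the sapply() version is the safer choice.

```r
# Running mean and sample SD from cumulative sums (base R only).
# running_stats() is a hypothetical helper name, not an existing function.
running_stats <- function(x) {
  n <- seq_along(x)
  running_mean <- cumsum(x) / n
  # Sample variance of the first n values:
  #   (sum(x^2) - n * mean^2) / (n - 1)
  running_var <- (cumsum(x^2) - n * running_mean^2) / (n - 1)
  # With a single observation the SD is undefined; use NA, as sd() does.
  running_var[n == 1] <- NA_real_
  # Clamp tiny negative values caused by floating-point rounding.
  running_sd <- sqrt(pmax(running_var, 0))
  data.frame(x = x, running_mean = running_mean, running_sd = running_sd)
}

aa <- c(2, 3, 4, 5, 6, 7, 8)
res <- running_stats(aa)
print(res)
```

For the example vector this should agree with the running_mean and running_sd columns shown above, up to floating-point rounding.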