r, aggregate, standard-deviation

Existing function to combine standard deviations in R?


I have 4 populations with known means and standard deviations. I would like to know the grand mean and grand sd. The grand mean is simple to calculate, and R even has a handy utility function for it, weighted.mean(). Does a similar function exist for combining standard deviations?

The calculation is not complicated, but an existing function would make my code cleaner and easier to understand.

Bonus question: what tools do you use to search for functions like this? I know it must be out there, but I've done a lot of searching and can't find it. Thanks!


Solution

  • Are the populations non-overlapping? If so, combinevar from the fishmethods package does this:

    library(fishmethods)
    combinevar
    

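    For instance, the example from Wikipedia would work like this (note that combinevar expects variances, so the standard deviations are squared):

    xbar <- c(70, 65)   # group means
    s <- c(3, 2)        # group standard deviations
    n <- c(1, 1)        # group sizes
    combinevar(xbar, s^2, n)
    

    The function returns the grand mean followed by the combined variance, so the standard deviation is sqrt(combinevar(xbar, s^2, n)[2]). Use the real group sizes for n: with n = 1, a group's own variance gets zero weight because of the (n - 1) factor.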

    If you don't want to install the package, the function is defined like this:

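    combinevar <- 
    function (xbar = NULL, s_squared = NULL, n = NULL) 
    {
        # xbar: group means; s_squared: group variances; n: group sizes
        if (length(xbar) != length(s_squared) | length(xbar) != length(n) | 
            length(s_squared) != length(n)) 
            stop("Vector lengths are different.")
        # Total of squared observations, recovered from each group's variance and mean
        sum_of_squares <- sum((n - 1) * s_squared + n * xbar^2)
        # Grand mean: group means weighted by group size
        grand_mean <- sum(n * xbar)/sum(n)
        # Sample variance of the pooled observations
        combined_var <- (sum_of_squares - sum(n) * grand_mean^2)/(sum(n) - 
            1)
        return(c(grand_mean, combined_var))
    }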
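
Since the question asks for a grand sd from known sds, a thin wrapper that takes standard deviations directly may be convenient. The following is only a sketch: combine_sd is a hypothetical helper name, not part of fishmethods, and it assumes non-overlapping groups and sample (n - 1) standard deviations. It splits the total sum of squares into within-group and between-group parts, which is algebraically the same quantity combinevar computes. The simulated groups are made up purely to check the result against the pooled raw data.

```r
# Hypothetical helper (not from fishmethods): combine per-group means,
# sample standard deviations, and sizes into a grand mean and grand sd.
combine_sd <- function(means, sds, n) {
  stopifnot(length(means) == length(sds), length(means) == length(n))
  grand_mean <- weighted.mean(means, n)
  # Squared deviations inside each group
  ss_within <- sum((n - 1) * sds^2)
  # Squared deviations of the group means from the grand mean
  ss_between <- sum(n * (means - grand_mean)^2)
  grand_sd <- sqrt((ss_within + ss_between) / (sum(n) - 1))
  c(mean = grand_mean, sd = grand_sd)
}

# Sanity check on made-up data: the summary-based result should match
# mean() and sd() computed on the pooled observations.
set.seed(1)
groups <- list(rnorm(10, 70, 3), rnorm(15, 65, 2),
               rnorm(8, 72, 4), rnorm(12, 68, 1))
res <- combine_sd(sapply(groups, mean), sapply(groups, sd), lengths(groups))
pooled <- unlist(groups)
check <- all.equal(unname(res), c(mean(pooled), sd(pooled)))
print(res)
print(check)
```

If the groups overlap, or the standard deviations were computed with a population (n) denominator, the formula above would need adjusting.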