Tags: r, weighted, weighted-average

Calculate mean and variance for weighted discrete random variables in R


I have the following data frame:

dat <- read.table(text="  X prob
1 1  0.1
2 2  0.2
3 3  0.4
4 4  0.3", header=TRUE)

Is there a built-in function or an elegant way to calculate the mean and variance of a discrete random variable in R?


Solution

  • There is a weighted.mean function in base R and the Hmisc package has a bunch of wtd.* functions.

    > with(dat, weighted.mean(X, prob))
    [1] 2.9
    
    > require(Hmisc)
    > wtd.var(x=dat$X, weights=dat$prob)
    [1] Inf
    # By default the weights are treated as replicate (frequency) weights, so the
    # denominator sum(weights) - 1 is zero when the probabilities sum to 1.
    # normwt=TRUE rescales the weights to sum to length(x), which is more appropriate here.
    > wtd.var(x=dat$X, weights=dat$prob, normwt=TRUE)
    [1] 1.186667
    

    The survey package by Thomas Lumley goes well beyond what this simple example shows. It provides mechanisms for handling complex weighting schemes across a variety of statistical modeling procedures:

    > require(survey)
    > dclus1 <- svydesign(id=~1, weights=~prob, data=dat)
    > v <- svyvar(~X, dclus1)
    > v
      variance     SE
    X   1.1867 0.7011
    

    These are sample statistics (the weights are rescaled to sum to n = 4 and the denominator is n - 1) rather than the variance of an abstract random variable, which for this data is sum(prob * (X - 2.9)^2) = 0.89. The sample-style result is appropriate for a statistical system, but it is probably not the expected answer to a probability homework question.
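
If the probability-theory definitions are what is wanted, they can be applied directly in base R without any package. The following is a minimal sketch, assuming the prob column holds probabilities that sum to 1; the names mu, sigma2 and sample_style are illustrative and do not come from the original answer.

```r
# Data from the question: values of X and their probabilities
dat <- data.frame(X = 1:4, prob = c(0.1, 0.2, 0.4, 0.3))

# Guard (assumption): the probabilities sum to 1, up to floating-point tolerance
stopifnot(isTRUE(all.equal(sum(dat$prob), 1)))

# E[X] = sum(x * p)
mu <- sum(dat$X * dat$prob)

# Var(X) = sum(p * (x - E[X])^2), with no n - 1 correction
sigma2 <- sum(dat$prob * (dat$X - mu)^2)

# Multiplying by n / (n - 1) gives the sample-style figure
n <- nrow(dat)
sample_style <- sigma2 * n / (n - 1)

print(c(mean = mu, variance = sigma2, sample_style = sample_style))
```

For this data the mean is 2.9 and the variance is 0.89; the rescaled value, 0.89 * 4 / 3, is about 1.186667, which matches the wtd.var(..., normwt=TRUE) output above.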