Search code examples
rmatrixcovariance

Failure to reproduce covariance matrix


I am trying to reproduce covariance matrix for the generated data

 set.seed(1)
 datam <- round((matrix((rnorm(300, mean = 2, sd = 0.5)), nrow = 100, ncol = 
 3)), digits = 2)

to do that, I should multiply this matrix's transpose by the original matrix and multiply further by 1/n, where n=100 and transpose is the following

datamT <- t(datam)

so,

datamT%*%datam*1/100 

FAILS to reproduce the covariance matrix of datam, originally determined by

cov(datam) 

I am not sure where the mistake is, so help would be appreciated


Solution

  • You need to subtract the column means for the approximation to be valid.

    x <- datam-rep(colMeans(datam),each=nrow(datam))
    ## or use scale(datam, scale=FALSE)
    
    t(x)%*%x/(nrow(x)-1)
                  [,1]          [,2]         [,3]
    [1,]  0.2015461010 -0.0001247677  0.004172283
    [2,] -0.0001247677  0.2289751111 -0.012216242
    [3,]  0.0041722828 -0.0122162424  0.266885848
    
    t(x)%*%x/(nrow(x)-1)-cov(datam)
                  [,1]          [,2]         [,3]
    [1,] -1.387779e-16  7.047314e-19 0.000000e+00
    [2,]  7.047314e-19 -5.551115e-17 0.000000e+00
    [3,]  0.000000e+00  0.000000e+00 1.110223e-16