Outputting N and significance for Pearson Correlation in R


I have 400 store departments and I'm running Pearson correlations between every pair of departments. How can I output N (the number of cases) and the significance level (p-value) for each correlation?

I'm using the cor function. Here is my current code which works fine:

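# Index of the last column to correlate
numprod <- ncol(data) - 2
# Columns 2..numprod as a matrix; named `mat` so it does not mask base::matrix()
mat <- as.matrix(data[, 2:numprod])
# Label column bound to the Pearson correlation matrix
AllChannels <- cbind(matrix("All channels", nrow = numprod - 1), cor(mat, use = "all.obs", method = "pearson"))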

In SPSS, when you run a correlation it outputs the correlation coefficient, N and significance. This is my desired result.

Thanks all!

Lucas


Solution

  • If N is just the length of one of the vectors, use length. If you want the inferential test of whether the correlation coefficient equals 0, use cor.test (as the help page ?cor tells you). If you want the degrees of freedom for the test, look more closely at ?cor.test.

    > cor.test(1:10,2:11)
    
        Pearson's product-moment correlation
    
    data:  1:10 and 2:11 
    t = 134217728, df = 8, p-value < 2.2e-16
    alternative hypothesis: true correlation is not equal to 0 
    95 percent confidence interval:
     1 1 
    sample estimates:
    cor 
      1 
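
    As a minimal sketch of reading the SPSS-style numbers out of a single cor.test result: the returned object is a list of class "htest", so the coefficient and p-value can be picked out by name, and for the Pearson test N can be recovered as df + 2. The vectors x and v and the variable names below are illustrative assumptions, not taken from the question.

    ```r
    # Illustrative vectors (not the question's data)
    x <- c(-2, -1, 0, 1, 2)
    v <- c(1, 2, 3, 5, 4)

    res <- cor.test(x, v, method = "pearson")

    r_value <- unname(res$estimate)       # correlation coefficient
    p_value <- res$p.value                # two-sided p-value
    n_pairs <- unname(res$parameter) + 2  # Pearson test has df = n - 2

    # Collect the three numbers in one named vector
    c(r = r_value, N = n_pairs, p = p_value)
    ```

    Recovering N from the degrees of freedom only applies to method = "pearson"; the other methods do not report df this way.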
    

    cor.test returns a list (an object of class "htest") for a single pair of vectors, so combining its result with cbind is not useful. For a whole matrix of variables, the Hmisc package has rcorr:

    install.packages("Hmisc")
    library(Hmisc)
    x <- c(-2, -1, 0, 1, 2)
    y <- c(4,   1, 0, 1, 4)
    z <- c(1,   2, 3, 4, NA)
    v <- c(1,   2, 3, 4, 5)
    rcorr(cbind(x,y,z,v))
    # rcorr returns a list with three elements: r (correlations), n (number of observations per pair), P (p-values):
    > rcorr(cbind(x,y,z,v))
      x     y     z v
    x 1  0.00  1.00 1
    y 0  1.00 -0.75 0
    z 1 -0.75  1.00 1
    v 1  0.00  1.00 1
    
    n
      x y z v
    x 5 5 4 5
    y 5 5 4 5
    z 4 4 5 4
    v 5 5 4 5
    
    P
      x      y      z      v     
    x        1.0000 0.0000 0.0000
    y 1.0000        0.2546 1.0000
    z 0.0000 0.2546        0.0000
    v 0.0000 1.0000 0.0000
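
    If installing Hmisc is not an option, a similar table can be sketched in base R by calling cor.test on every pair of columns. The helper name pairwise_cor_table is hypothetical (it is not part of base R or Hmisc), and the pairwise deletion of NA values is an assumption chosen to mirror the n matrix above.

    ```r
    # Hypothetical helper: one row per unordered pair of columns
    pairwise_cor_table <- function(m) {
      pairs <- combn(colnames(m), 2)  # every unordered pair of column names
      rows <- lapply(seq_len(ncol(pairs)), function(k) {
        a <- pairs[1, k]
        b <- pairs[2, k]
        ok <- complete.cases(m[, a], m[, b])  # drop rows with NA in either column
        res <- cor.test(m[ok, a], m[ok, b], method = "pearson")
        data.frame(var1 = a, var2 = b,
                   r = unname(res$estimate),  # correlation coefficient
                   N = sum(ok),               # cases used for this pair
                   p = res$p.value)           # two-sided p-value
      })
      do.call(rbind, rows)
    }

    # Same example data as in the rcorr call above
    x <- c(-2, -1, 0, 1, 2)
    y <- c(4,   1, 0, 1, 4)
    z <- c(1,   2, 3, 4, NA)
    v <- c(1,   2, 3, 4, 5)
    tab <- pairwise_cor_table(cbind(x, y, z, v))
    tab
    ```

    For the y and z pair this should agree with the rcorr output above (r about -0.75, N = 4, p about 0.2546). With 400 columns the loop makes 79,800 calls to cor.test, so rcorr is likely the faster route when Hmisc is available.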