I've got 400 store departments and I'm running (Pearson) correlations between all the departments. How can I output the 'N' (number of cases) and the significance level (p value)?
I'm using the cor function. Here is my current code which works fine:
numprod <- ncol(data) - 2;
matrix <- as.matrix(data[ ,2:numprod]);
AllChannels <- cbind(matrix(nrow = numprod-1,"All channels"),cor(matrix, use="all.obs", method="pearson"));
In SPSS, when you run a correlation it outputs the correlation coefficient, N and significance. This is my desired result.
Thanks all!
Lucas
If it's just the length of one of the vectors, then use length. If you want the inferential calculations for testing whether the correlation coefficient equals 0, then use cor.test (as the help page ?cor tells you). If it's the number of degrees of freedom for the test, then look more closely at ?cor.test.
> cor.test(1:10,2:11)
        Pearson's product-moment correlation
data:  1:10 and 2:11
t = 134217728, df = 8, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 1 1
sample estimates:
cor
  1
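Because cor.test returns a list, the individual pieces can be pulled out by name rather than read off the printout. Here is a minimal sketch; the vectors a and b are made up for illustration and are not from the question. For Pearson's test the degrees of freedom are n - 2, so N can be recovered from them.

```r
# Hypothetical example vectors (not from the original post)
a <- c(1, 2, 3, 4, 5, 6, 7, 8)
b <- c(2, 1, 4, 3, 6, 5, 8, 7)

res <- cor.test(a, b, method = "pearson")

r  <- unname(res$estimate)    # correlation coefficient
p  <- res$p.value             # two-sided p-value for H0: correlation = 0
df <- unname(res$parameter)   # degrees of freedom of the t test
n  <- df + 2                  # Pearson's test uses df = n - 2 (complete pairs)

out <- c(r = r, n = n, p = p)
print(out)
```

This gives the three numbers SPSS reports for one pair of variables. Doing it for every pair of 400 columns means looping over pairs, which is why a matrix-based function is more convenient.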
The result of cor.test will be a list, so it's not going to be useful to use cbind. The Hmisc package has rcorr:
install.packages("Hmisc")
library(Hmisc)
x <- c(-2, -1, 0, 1, 2)
y <- c(4, 1, 0, 1, 4)
z <- c(1, 2, 3, 4, NA)
v <- c(1, 2, 3, 4, 5)
rcorr(cbind(x,y,z,v))
# Returns a list with three elements: the correlations (r), the pairwise counts (n) and the p-values (P):
> rcorr(cbind(x,y,z,v))
  x     y     z v
x 1  0.00  1.00 1
y 0  1.00 -0.75 0
z 1 -0.75  1.00 1
v 1  0.00  1.00 1
n
  x y z v
x 5 5 4 5
y 5 5 4 5
z 4 4 5 4
v 5 5 4 5
P
  x      y      z      v
x        1.0000 0.0000 0.0000
y 1.0000        0.2546 1.0000
z 0.0000 0.2546        0.0000
v 0.0000 1.0000 0.0000
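With 400 departments, three 400 x 400 matrices are awkward to read, so it may help to reshape them into one row per pair (79,800 rows). The following is a sketch only: flatten_rcorr is a hypothetical helper name, not part of Hmisc, and it assumes the result is a list holding square matrices r, n and P with row and column names, as in the printout above.

```r
# Hypothetical helper (not part of Hmisc): one row per pair of variables.
# `res` is assumed to be a list with square matrices r, n and P.
flatten_rcorr <- function(res) {
  upper <- upper.tri(res$r)                 # each pair once, diagonal excluded
  idx <- which(upper, arr.ind = TRUE)       # row/column index of every pair
  data.frame(
    var1 = rownames(res$r)[idx[, "row"]],
    var2 = colnames(res$r)[idx[, "col"]],
    r    = res$r[upper],                    # correlation coefficient
    n    = res$n[upper],                    # cases used for this pair
    p    = res$P[upper],                    # p-value
    stringsAsFactors = FALSE
  )
}

# Usage on the small example above; runs only if Hmisc is installed
if (requireNamespace("Hmisc", quietly = TRUE)) {
  x <- c(-2, -1, 0, 1, 2)
  y <- c(4, 1, 0, 1, 4)
  z <- c(1, 2, 3, 4, NA)
  v <- c(1, 2, 3, 4, 5)
  print(flatten_rcorr(Hmisc::rcorr(cbind(x, y, z, v))))
}
```

For the data in the question, the same helper could presumably be applied to rcorr(as.matrix(data[, 2:numprod])), since rcorr expects a numeric matrix.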