The `psych` package contains a function, `alpha` (see `?alpha`), which calculates test reliability and some item statistics. When fed raw data (a data.frame with binary values for correct/incorrect answers), it returns, among other things, the mean and standard deviation for each item. Sometimes, however, it does not, and provides only the item-whole correlations.
Why is that?
The documentation states that mean and sd are only calculated "For data matrices [...]", and that `x` is "A data.frame or matrix of data, or a covariance or correlation matrix". But how does it know whether I'm giving it raw data or a correlation matrix?
The `alpha` function in `psych` detects whether the input `x` is a correlation matrix by calling `isCorrelation`.
Consider the dataset given in this tutorial:
datafilename <- "http://personality-project.org/R/datasets/extraversion.items.txt"
items <- read.table(datafilename, header = TRUE)
df <- with(items, data.frame(q_262, q_1480, q_819, q_1180, q_1742))
The object `df` is a `data.frame` whose columns hold the responses to five items. `isCorrelation` (used inside `alpha`) correctly detects that this is not a correlation matrix:
library(psych)
isCorrelation(df)
[1] FALSE
If we compute the correlation matrix of `df` and pass it to `isCorrelation`, we again get the correct answer:
mtx <- cor(df)
isCorrelation(mtx)
[1] TRUE
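If the tutorial file is not at hand, the same two kinds of input can be sketched in base R with simulated data. The names `sim` and `sim_cor` and the simulated responses are assumptions made for illustration, not part of the original dataset; the point is only that raw responses form a respondents-by-items table, while their correlation matrix is a square items-by-items matrix.

```r
# Simulated binary item responses (hypothetical data, not the tutorial file)
set.seed(1)
sim <- as.data.frame(matrix(rbinom(50 * 5, size = 1, prob = 0.5), nrow = 50))

# Correlation matrix of the simulated items
sim_cor <- cor(sim)

is.data.frame(sim)            # raw responses are stored as a data.frame
dim(sim)                      # respondents by items
is.matrix(sim_cor)            # cor() returns a plain matrix
dim(sim_cor)                  # items by items
isSymmetric(unname(sim_cor))  # a correlation matrix is symmetric
```

Inspecting the class and dimensions of an object in this way, before passing it to `alpha`, is a quick check on which kind of input it will look like.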
Looking inside `isCorrelation`, one can see that an input is treated as a correlation matrix if it is not a `data.frame` (hence a `matrix`) and is symmetric:
isCorrelation <- function (x)
{
return(!is.data.frame(x) && isSymmetric(unname(x)))
}
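One consequence of this definition is that raw data supplied as a square, symmetric matrix would pass the check. The sketch below copies the test shown above into a local function, `is_correlation_sketch` (a hypothetical name, used so the example does not depend on the installed `psych` version, whose `isCorrelation` may differ), and applies it to a small made-up response matrix `raw_mat`.

```r
# Local copy of the check shown above; the name is hypothetical
is_correlation_sketch <- function(x) {
  !is.data.frame(x) && isSymmetric(unname(x))
}

# Made-up raw 0/1 responses: 3 respondents by 3 items.
# The matrix happens to be square and symmetric.
raw_mat <- matrix(c(1, 0, 1,
                    0, 1, 0,
                    1, 0, 1), nrow = 3, byrow = TRUE)

# Raw data, but it passes the test for a correlation matrix
is_correlation_sketch(raw_mat)

# Converting to a data.frame makes the first condition fail
is_correlation_sketch(as.data.frame(raw_mat))
```

Under this definition, passing raw responses as a `data.frame` rather than a `matrix` rules out the misclassification, since the first condition then always fails.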