R psych tetrachoric - dichotomic variables


I have a data frame of dichotomous variables corresponding to the items of a personality questionnaire. Here are the first rows.

  head(mixclinic)
  # A tibble: 6 x 15
    CMS_1 CMS_2 CMS_3 CMS_4 CMS_5 CMS_6 CMS_7 CMS_8 CMS_9 CMS_10 CMS_11
    <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct>  <fct> 
    1 1     1     0     1     0     1     0     0     0     0      0     
    2 1     1     0     1     0     0     0     1     0     0      0     
    3 1     1     0     1     0     1     0     0     0     0      0     
    4 0     1     0     1     0     1     0     1     0     0      0     
    5 0     1     0     1     0     1     0     0     0     0      0     
    6 1     1     0     1     1     1     0     0     0     0      0 

I would like to compute tetrachoric correlations in order to find the factors that explain the largest share of the variability. While searching for R-based resources, I came across the 'psych' package, which has the function tetrachoric. I read the documentation but still could not run the analysis, and I could not find a tutorial that covers it. Could anyone help or point me to useful sources? Thanks


Solution

  • It may be that the function does not handle factors well when a data frame is passed as the argument (converting all columns to numeric might fix that). However, it also accepts a matrix, and that worked for the data set I created. In the future, it is always helpful to include a reproducible example. Hope this helps!

    Edit: to clarify, I think the issue is that your dataset consists of factors. The function does not seem to work when the variables are factors. It does work if the variables are numeric or if the data is passed as a matrix. So any way of converting your data frame columns to numeric, or the data frame to a matrix, should work (e.g., the df_matrix <- data.matrix(df) line in my code converts the data frame to a matrix). Note that data.matrix() replaces factors with their internal integer codes, so 0/1 factors become 1/2 in the matrix. Let me know if you have any questions.

    > # Creating your dataset
    > 
    > library(tidyverse)
    > library(psych)
    > 
    > df <- data.frame(CMS_1 = sample(2, replace = TRUE, size = 10) - 1,
    +                  CMS_2 = sample(2, replace = TRUE, size = 10) - 1,
    +                  CMS_3 = sample(2, replace = TRUE, size = 10) - 1,
    +                  CMS_4 = sample(2, replace = TRUE, size = 10) - 1,
    +                  CMS_5 = sample(2, replace = TRUE, size = 10) - 1,
    +                  CMS_6 = sample(2, replace = TRUE, size = 10) - 1,
    +                  CMS_7 = sample(2, replace = TRUE, size = 10) - 1,
    +                  CMS_8 = sample(2, replace = TRUE, size = 10) - 1)
    > 
    > df <- df %>% mutate_if(is.numeric, as.factor)
    > str(df)
    'data.frame':   10 obs. of  8 variables:
     $ CMS_1: Factor w/ 2 levels "0","1": 1 2 1 2 2 2 1 2 2 2
     $ CMS_2: Factor w/ 2 levels "0","1": 1 2 2 1 2 2 1 2 1 2
     $ CMS_3: Factor w/ 2 levels "0","1": 1 1 2 2 1 1 1 1 1 1
     $ CMS_4: Factor w/ 2 levels "0","1": 2 2 1 2 1 1 2 1 1 2
     $ CMS_5: Factor w/ 2 levels "0","1": 2 1 1 2 2 2 1 2 1 2
     $ CMS_6: Factor w/ 2 levels "0","1": 2 2 1 1 2 2 2 2 1 2
     $ CMS_7: Factor w/ 2 levels "0","1": 2 1 2 1 1 2 1 1 1 2
     $ CMS_8: Factor w/ 2 levels "0","1": 1 2 2 1 1 2 1 1 1 1
    > 
    > # Converting your data.frame to a matrix
    > df_matrix <- data.matrix(df)
    > 
    > 
    > tetrachoric(df_matrix)
    For i = 6 j = 3  A cell entry of 0 was replaced with correct =  0.5.  Check your data!
    For i = 8 j = 2  A cell entry of 0 was replaced with correct =  0.5.  Check your data!
    
    Call: tetrachoric(x = df_matrix)
    tetrachoric correlation 
          CMS_1 CMS_2 CMS_3 CMS_4 CMS_5 CMS_6 CMS_7 CMS_8
    CMS_1  1.00                                          
    CMS_2  0.47  1.00                                    
    CMS_3 -0.31 -0.21  1.00                              
    CMS_4 -0.37 -0.54 -0.02  1.00                        
    CMS_5  0.43  0.27 -0.22  0.02  1.00                  
    CMS_6  0.14  0.45 -0.74  0.29  0.44  1.00            
    CMS_7 -0.44  0.34  0.22 -0.02  0.29  0.20  1.00      
    CMS_8 -0.13  0.58  0.33 -0.33 -0.44 -0.10  0.46  1.00
    
     with tau of 
    CMS_1 CMS_2 CMS_3 CMS_4 CMS_5 CMS_6 CMS_7 CMS_8 
    -0.52 -0.25  0.84  0.00 -0.25 -0.52  0.25  0.52 
    Warning message:
    In cor.smooth(mat) : Matrix was not positive definite, smoothing was done
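
    As a follow-up to the conversion discussed above, here is a minimal sketch of an alternative that keeps the original 0/1 values instead of the factor codes, and then passes the tetrachoric matrix on to a factor analysis, which is what the question is ultimately after. The data are simulated and hypothetical (the names items, make_item, n, and the cut points are made up for illustration), and nfactors = 1 is only a placeholder, not a recommendation for the real questionnaire. The psych calls are guarded so the conversion part runs even without the package installed.

```r
# Hypothetical data: four 0/1 items driven by one latent trait,
# stored as factors like the columns in the question.
set.seed(42)
n <- 300
latent <- rnorm(n)
make_item <- function(cut) {
  factor(as.integer(latent + rnorm(n) > cut), levels = c(0, 1))
}
items <- data.frame(
  CMS_1 = make_item(-0.5),
  CMS_2 = make_item(0),
  CMS_3 = make_item(0.25),
  CMS_4 = make_item(0.5)
)

# data.matrix() returns the internal factor codes, so 0/1 becomes 1/2.
codes <- data.matrix(items)
print(range(codes))

# Going through the labels instead keeps the original 0/1 coding.
items_num <- as.data.frame(
  lapply(items, function(f) as.numeric(as.character(f)))
)
print(head(items_num))

# The psych steps are optional and only run if the package is available.
if (requireNamespace("psych", quietly = TRUE)) {
  tet <- psych::tetrachoric(items_num)
  print(round(tet$rho, 2))   # tetrachoric correlation matrix
  print(round(tet$tau, 2))   # estimated thresholds

  # Factor analysis on the tetrachoric matrix; nfactors = 1 is an
  # assumption for this toy data, not a recommendation for real items.
  fit <- psych::fa(tet$rho, nfactors = 1, n.obs = nrow(items_num))
  print(fit$loadings)
}
```

    The "cell entry of 0 was replaced" and "not positive definite" messages in the output above are most likely a side effect of the toy data having only 10 rows, which leaves some 2x2 tables with empty cells; with a questionnaire sample of realistic size they may not appear.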