r, correlation, likert

Spearman correlation in R between different categories on Likert-scale data


I have a data set with two columns. One column holds the answers on an ordinal Likert scale, and the second column holds the category of each question.

 df
# A tibble: 50 × 2
   answers           Cat  
   <fct>             <chr>
 1 Very Satisfied    A    
 2 Satisfied         A    
 3 Very Satisfied    B    
 4 Average           B    
 5 Dissatisfied      C    
 6 Average           A    
 7 Very Satisfied    A    
 8 Very Satisfied    B    
 9 Satisfied         B    
10 Very Dissatisfied C    

I want to calculate the Spearman correlation matrix between the categories A, B and C, so the result should be a 3 × 3 matrix with 1s on the main diagonal. Because the data are on a Likert scale, I chose the Spearman correlation.

How can I do that in R?

 structure(list(answers = structure(c(5L, 1L, 4L, 5L, 4L, 2L, 
5L, 2L, 5L, 2L, 5L, 5L, 3L, 3L, 5L, 4L, 5L, 5L, 2L, 1L, 4L, 3L, 
4L, 2L, 5L, 5L, 5L, 5L, 3L, NA, 4L, 1L, 4L, 4L, 2L, 2L, 2L, 1L, 
4L, 3L, 5L, 2L, 1L, 3L, 1L, 5L, 1L, 1L, 4L, 3L, 4L, 2L, 5L, 3L, 
1L, 2L, 4L, 5L, 1L, NA), levels = c("Very Dissatisfied", "Dissatisfied", 
"Average", "Satisfied", "Very Satisfied"), class = "factor"), 
    Cat = c("A", "A", "B", "B", "C", "C", "A", "A", "B", "B", 
    "C", "C", "A", "A", "B", "B", "C", "C", "A", "A", "B", "B", 
    "C", "C", "A", "A", "B", "B", "C", "C", "A", "A", "B", "B", 
    "C", "C", "A", "A", "B", "B", "C", "C", "A", "A", "B", "B", 
    "C", "C", "A", "A", "B", "B", "C", "C", "A", "A", "B", "B", 
    "C", "C")), row.names = c(NA, -60L), class = c("tbl_df", 
"tbl", "data.frame"))

Solution

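  • Your answers are stored as a factor, not as numbers, so you cannot compute correlations until you convert them to numeric. Calling as.numeric() on a factor returns its integer level codes (1 to 5 here, in the order the levels are defined). You also didn't specify a question ID. Assuming that, within each category, the answers appear in the same question order (so the i-th answer in A, B and C all belong to question i), you can use the following code: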

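    library(tidyr); library(dplyr)
    
    # Convert the factor to its level codes and number the answers within
    # each category (the .by argument needs dplyr >= 1.1.0), then spread
    # the categories into one column each.
    df2 <- mutate(df, answers=as.numeric(answers),
                  qid=row_number(), .by=Cat) |>
      pivot_wider(names_from=Cat, values_from=answers)
    
    # Drop the qid column; NA answers are excluded pair by pair.
    cor(df2[,-1], method = "spearman", use="pairwise.complete.obs")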
    

              A         B         C
    A 1.0000000 0.4439327 0.5145403
    B 0.4439327 1.0000000 0.1429033
    C 0.5145403 0.1429033 1.0000000
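
The same idea can be sketched in base R, which makes the two assumptions above easier to see. This is only an illustration on made-up data: `toy`, `lv`, `codes`, `wide` and `m` are hypothetical names, and the values are not taken from the question's df. It assumes that the factor levels are listed from lowest to highest and that the i-th answer in each category belongs to the same question.

```r
# Hypothetical toy data: 4 questions, 3 categories, rows ordered A, B, C, A, B, C, ...
lv <- c("Very Dissatisfied", "Dissatisfied", "Average",
        "Satisfied", "Very Satisfied")
codes <- c(5, 4, 3,   # question 1: A, B, C
           4, 5, 2,   # question 2
           2, 1, 4,   # question 3
           1, 2, 5)   # question 4
toy <- data.frame(
  answers = factor(lv[codes], levels = lv),
  Cat     = rep(c("A", "B", "C"), times = 4)
)

# as.integer() on a factor returns the level codes, so the codes only
# reflect the Likert order if the levels are defined from low to high.
toy$score <- as.integer(toy$answers)

# Pair the answers by their position within each category:
# split() gives one vector per category, and equal-length vectors
# can be bound into one column per category.
wide <- as.data.frame(split(toy$score, toy$Cat))

# Spearman correlation matrix between the categories;
# missing answers (none in this toy data) would be dropped pairwise.
m <- cor(wide, method = "spearman", use = "pairwise.complete.obs")
print(m)
```

If the categories do not contain the same number of answers, the conversion to a data frame fails, which is a useful warning that the positional pairing assumption does not hold and a real question ID is needed.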