
Is there a way to calculate a Spearman's correlation row by row in a data frame?


I have a large data frame. Its first four rows look like this:

X1 <- list(c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5), c(1.5, 1.5, 5, 5, 5, 5, 5), c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5), c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5))
X2 <- list(c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5), c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5), c(1.5, 1.5, 5, 5, 5, 5, 5), c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5))

# generating data frame
all_comb <- data.frame(X1 = I(X1), X2 = I(X2))

I want to calculate Spearman's rank correlation coefficient for each row, something like this:

  1. cor.test(c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5), c(1.5, 1.5, 5, 5, 5, 5, 5), method = "spearman")$estimate
  2. cor.test(c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5), c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5), method = "spearman")$estimate ...

Then I want to store all Spearman's rhos that I get per row in a new column in the dataset.

I tried the following, but I get the error message that x must be a numeric vector:

b = apply(all_comb, 1, function(x) {
  cor.test(all_comb, x, method = "spearman")
})

Solution

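  • In your data frame, each cell is itself a vector, so the row x that apply() passes to your function is a list holding two vectors. Your attempt fails because it passes the whole data frame all_comb to cor.test(), which is not a numeric vector. I assume you want the correlation between the vector in column 1 and the vector in column 2 of each row, so extract both from x with [[ ]]: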

    b = apply(all_comb, 1, function(x) {
      cor.test(x[[1]], x[[2]], method = "spearman")
    })
    

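    This returns a list with one test result per row. You probably want to tidy these a bit, in which case you can convert each result to a one-row data frame with broom::tidy() (this needs the broom package, and the native pipe |> needs R 4.1 or later) and then bind the rows together: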

    b = apply(all_comb, 1, function(x) {
      cor.test(x[[1]], x[[2]], method = "spearman") |> broom::tidy()
    }) 
    
    do.call(rbind, b)
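
If all you need is the rho per row stored in a new column, as the question asks, a minimal sketch could look like the following. It makes two assumptions that are not part of the answer above: it uses cor() rather than cor.test(), since cor() returns only the coefficient, and it loops with vapply() over the row index instead of apply(). The column name rho is just an illustrative choice.

```r
# Rebuild the example data from the question
X1 <- list(c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5), c(1.5, 1.5, 5, 5, 5, 5, 5),
           c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5), c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5))
X2 <- list(c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5), c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5),
           c(1.5, 1.5, 5, 5, 5, 5, 5), c(1, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5))
all_comb <- data.frame(X1 = I(X1), X2 = I(X2))

# For each row i, correlate the vector stored in X1 with the one stored in X2.
# vapply() guarantees one numeric value per row, so the result fits a column.
all_comb$rho <- vapply(
  seq_len(nrow(all_comb)),
  function(i) cor(all_comb$X1[[i]], all_comb$X2[[i]], method = "spearman"),
  numeric(1)
)

all_comb$rho
```

If you also need the p-values, keep the cor.test() approach above instead and take the estimate and p.value columns from the tidied result.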