
How can I pull a group-based vector to pass to a function within dplyr's summarize or mutate?


I am trying to create a summary table of accuracy, sensitivity, and specificity using the AUC function within the psych package. I would like to define the input vector (t, a 4 x 1 vector) for each level of the grouped variable.

What I have tried seems to ignore the grouping.

Example:

library(tidyverse)
library(psych)

Data <- data.frame(Class = c("A","B","C","D"),
                   TP = c(198,185,221,192),
                   FP = c(1,1,6,1),
                   FN = c(42,55,19,48),
                   TN = c(569,570,564,569))

Data %>% 
  group_by(Class) %>%
  mutate(Accuracy = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Accuracy,
         Sensitivity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Sensitivity,
         Specificity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Specificity)

This gets close to the output I want, but Accuracy, Sensitivity, and Specificity are calculated from the first row only and then repeated for every group:

# A tibble: 4 x 8
# Groups:   Class [4]
  Class    TP    FP    FN    TN Accuracy Sensitivity Specificity
  <fct> <dbl> <dbl> <dbl> <dbl>    <dbl>       <dbl>       <dbl>
1 A       198     1    42   569    0.947       0.995       0.931
2 B       185     1    55   570    0.947       0.995       0.931
3 C       221     6    19   564    0.947       0.995       0.931
4 D       192     1    48   569    0.947       0.995       0.931

I have also tried with summarize:

Data %>% 
  group_by(Class) %>%
  summarize(Accuracy = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Accuracy,
            Sensitivity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Sensitivity,
            Specificity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Specificity)

But the output is the same as above.

The desired output is a separate calculation for each level of "Class":

# A tibble: 4 x 8
  Class    TP    FP    FN    TN Accuracy Sensitivity Specificity
  <fct> <dbl> <dbl> <dbl> <dbl>    <dbl>       <dbl>       <dbl>
1 A       198     1    42   569     0.95        0.99        0.93
2 B       185     1    55   570     0.93        0.99        0.91
3 C       221     6    19   564     0.97        0.97        0.97
4 D       192     1    48   569     0.94        0.99        0.92

How do I get the function call within summarize or mutate to maintain the groups?


Solution

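  • Inside the pipe, . refers to the whole data frame rather than the current group, so .[1,2:5] always selects the first row. Indexing the rows by Class works here, although it appears to rely on Class being a factor whose integer codes happen to match the row positions: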

    Data %>% 
      group_by(Class) %>%
      mutate(Accuracy = AUC(t = unlist(.[Class,2:5], use.names=FALSE))$Accuracy,
             Sensitivity = AUC(t = unlist(.[Class,2:5], use.names=FALSE))$Sensitivity,
             Specificity = AUC(t = unlist(.[Class,2:5], use.names=FALSE))$Specificity)
    

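    Referring to the columns by name is clearer, because inside a grouped mutate each column name evaluates to the current group's values only: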

    Data %>% 
      group_by(Class) %>%
      mutate(Accuracy = AUC(t = c(TP, FP, FN, TN))$Accuracy,
             Sensitivity = AUC(t = c(TP, FP, FN, TN))$Sensitivity,
             Specificity = AUC(t = c(TP, FP, FN, TN))$Specificity)
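
The version above runs AUC three times per group. One possible refinement, sketched here rather than offered as the definitive approach, is to call AUC once per group, keep the result in a temporary list column, and pull the three statistics out of it. This assumes exactly one row per Class, as in the example data. The names auc and Result are made up for illustration, and plot = "n" is meant to suppress AUC's default plots (check ?AUC for your psych version).

```r
library(dplyr)
library(psych)

Data <- data.frame(Class = c("A","B","C","D"),
                   TP = c(198,185,221,192),
                   FP = c(1,1,6,1),
                   FN = c(42,55,19,48),
                   TN = c(569,570,564,569))

Result <- Data %>%
  group_by(Class) %>%
  # Call AUC once per group and keep the whole result in a list column.
  # Assumption: one row per Class, so c(TP, FP, FN, TN) has length 4.
  mutate(auc = list(AUC(t = c(TP, FP, FN, TN), plot = "n")),
         # auc is a length-1 list within each group; [[1]] extracts the result
         Accuracy    = auc[[1]]$Accuracy,
         Sensitivity = auc[[1]]$Sensitivity,
         Specificity = auc[[1]]$Specificity) %>%
  ungroup() %>%
  select(-auc)  # drop the temporary list column

Result
```

If a Class can span several rows, the counts would need to be aggregated first (for example with summarize and sum) before passing them to AUC.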