Comparing multiple AUCs in parallel (R)


I am using the pROC package in R to calculate and compare the AUCs of multiple tests, to see which test best discriminates between patients and controls. However, I have a large number of tests and want to run pairwise comparisons of each test's AUC against every other test's, then correct for multiple comparisons. This is as far as I've gotten with my code (reproducible example with a simulated dataset below):

#load pROC
library(pROC)

#generate df with random numbers
set.seed(123)
df <- data.frame(disease_status = rbinom(n=100, size=1, prob=0.20),
                 test1 = rnorm(100, mean=15, sd=4),
                 test2 = rnorm(100, mean=30, sd=2),
                 test3 = rnorm(100, mean=50, sd=3))

#create roc object for test1, test2, test3
roc.out_test1<-roc(df$disease_status, df$test1, plot=TRUE, smooth = FALSE)
roc.out_test2<-roc(df$disease_status, df$test2, plot=TRUE, smooth = FALSE)
roc.out_test3<-roc(df$disease_status, df$test3, plot=TRUE, smooth = FALSE)

#compare the AUC of test1 and test 2
roc.test(roc.out_test1, roc.out_test2, reuse.auc=TRUE, method="delong", na.rm=TRUE)

#DeLong's test for two correlated ROC curves
#data:  roc.out_test1 and roc.out_test2
#Z = 0.60071, p-value = 0.548
#alternative hypothesis: true difference in AUC is not equal to 0
#sample estimates:
#AUC of roc1 AUC of roc2 
#0.5840108   0.5216802 

#create a function to do above for all comparisons
vec_ROCs1 <- c("roc.out_test1,", "roc.out_test2,", "roc.out_test3,")
vec_ROCs2 <- c("roc.out_test1", "roc.out_test2", "roc.out_test3")
ROCs2_specifications  <- paste0(vec_ROCs2, ",", "reuse.auc=TRUE")
test <- unlist(lapply(ROCs2_specifications, function(x) paste0(vec_ROCs1, x)))
test2 <- lapply(test, function(x) roc.test(x))

#Error in roc.test.default(x) : 
#  argument "predictor1" is missing, with no default 

Please let me know your thoughts and suggestions on how to fix this!

Thank you.


Solution

  • The following should work. I have not written out every detail, so feel free to ask if any part of the code is unclear.

    #load pROC
    library(pROC)
    #> Type 'citation("pROC")' for a citation.
    #> 
    #> Attaching package: 'pROC'
    #> The following objects are masked from 'package:stats':
    #> 
    #>     cov, smooth, var
    
    #generate df with random numbers
    set.seed(123)
    df <- data.frame(disease_status = rbinom(n=100, size=1, prob=0.20),
                     test1 = rnorm(100, mean=15, sd=4),
                     test2 = rnorm(100, mean=30, sd=2),
                     test3 = rnorm(100, mean=50, sd=3))
    
    #create roc object for test1, test2, test3
    roc.out_test1<-roc(df$disease_status, df$test1, plot=TRUE, smooth = FALSE)
    #> Setting levels: control = 0, case = 1
    #> Setting direction: controls < cases
    
    roc.out_test2<-roc(df$disease_status, df$test2, plot=TRUE, smooth = FALSE)
    #> Setting levels: control = 0, case = 1
    #> Setting direction: controls < cases
    
    roc.out_test3<-roc(df$disease_status, df$test3, plot=TRUE, smooth = FALSE)
    #> Setting levels: control = 0, case = 1
    #> Setting direction: controls < cases
    
    # compare the AUC of test1 and test 2
    roc.test(roc.out_test1, roc.out_test2, reuse.auc = TRUE, method = "delong", na.rm = TRUE)
    #> 
    #>  DeLong's test for two correlated ROC curves
    #> 
    #> data:  roc.out_test1 and roc.out_test2
    #> Z = 0.60071, p-value = 0.548
    #> alternative hypothesis: true difference in AUC is not equal to 0
    #> sample estimates:
    #> AUC of roc1 AUC of roc2 
    #>   0.5840108   0.5216802
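
    With a large number of tests, creating each roc object by hand gets tedious. A minimal sketch, assuming every column of df other than disease_status holds a test score, builds a named list of roc objects in one pass. The names test_cols and roc_list are illustrative only, and levels and direction are set explicitly (matching what pROC detected above) so that every curve is built the same way:

```r
library(pROC)

# Same simulated data as in the question
set.seed(123)
df <- data.frame(disease_status = rbinom(n = 100, size = 1, prob = 0.20),
                 test1 = rnorm(100, mean = 15, sd = 4),
                 test2 = rnorm(100, mean = 30, sd = 2),
                 test3 = rnorm(100, mean = 50, sd = 3))

# Assumption: every column except the outcome is a test score
test_cols <- setdiff(names(df), "disease_status")

# lapply over a named character vector keeps the names,
# so the result is a list named test1, test2, test3
roc_list <- lapply(
  setNames(test_cols, test_cols),
  function(col) roc(df$disease_status, df[[col]],
                    levels = c(0, 1), direction = "<")
)

# AUC of each test as a plain named numeric vector
auc_values <- sapply(roc_list, function(r) as.numeric(auc(r)))
auc_values
```

    A list like this can be passed to combn in place of a hand-written list of roc objects.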
    

    Now we generate a list of all pairwise combinations of the three tests and run the roc.test function on each pair, using the same parameters that you set.

    all_tests <- combn(
      list(
        "test1" = roc.out_test1,
        "test2" = roc.out_test2,
        "test3" = roc.out_test3
      ),
      # forward ... so reuse.auc, method and na.rm actually reach roc.test
      FUN = function(x, ...) roc.test(x[[1]], x[[2]], ...),
      m = 2,
      simplify = FALSE, 
      reuse.auc = TRUE, 
      method = "delong", 
      na.rm = TRUE
    )
    

    The output is a list of choose(3, 2) = 3 elements (i.e. the number of combinations of 3 elements taken 2 at a time), and each element is the result of one roc.test call. For example, the first element reproduces your earlier test of test1 against test2:

    all_tests[[1]]
    #> 
    #>  DeLong's test for two correlated ROC curves
    #> 
    #> data:  x[[1]] and x[[2]]
    #> Z = 0.60071, p-value = 0.548
    #> alternative hypothesis: true difference in AUC is not equal to 0
    #> sample estimates:
    #> AUC of roc1 AUC of roc2 
    #>   0.5840108   0.5216802
    

    The only problem here is that it's difficult to recognise which tests are used in the comparisons, so we can also add a list of names:

    tests_names <- combn(
      list("test1", "test2", "test3"), 
      m = 2, 
      FUN = paste, 
      simplify = TRUE, 
      collapse = "_"
    )
    all_tests <- setNames(all_tests, tests_names)
    

    This is the result.

    names(all_tests)
    #> [1] "test1_test2" "test1_test3" "test2_test3"
    

    The names of the objects flag the tests that are used in the comparison.

    all_tests$test1_test2
    #> 
    #>  DeLong's test for two correlated ROC curves
    #> 
    #> data:  x[[1]] and x[[2]]
    #> Z = 0.60071, p-value = 0.548
    #> alternative hypothesis: true difference in AUC is not equal to 0
    #> sample estimates:
    #> AUC of roc1 AUC of roc2 
    #>   0.5840108   0.5216802
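
    The question also asks about correcting for multiple comparisons. roc.test returns an "htest" object, so each p-value is available as the p.value element and can be passed to p.adjust from base R. The following is only a sketch: it rebuilds the pairwise tests so that it runs on its own, the names rocs, pairs, p_raw and p_holm are illustrative, and Holm is one possible adjustment method rather than a recommendation for your data.

```r
library(pROC)

# Same simulated data as in the question
set.seed(123)
df <- data.frame(disease_status = rbinom(n = 100, size = 1, prob = 0.20),
                 test1 = rnorm(100, mean = 15, sd = 4),
                 test2 = rnorm(100, mean = 30, sd = 2),
                 test3 = rnorm(100, mean = 50, sd = 3))

# One roc object per test column (a data frame is a list of columns)
rocs <- lapply(df[c("test1", "test2", "test3")], function(score)
  roc(df$disease_status, score, levels = c(0, 1), direction = "<"))

# All pairs of test names, then one DeLong test per pair
pairs <- combn(names(rocs), 2, simplify = FALSE)
all_tests <- lapply(pairs, function(p)
  roc.test(rocs[[p[1]]], rocs[[p[2]]], method = "delong"))
names(all_tests) <- sapply(pairs, paste, collapse = "_")

# Pull out the raw p-values and adjust them (Holm chosen as an example)
p_raw  <- sapply(all_tests, function(res) res$p.value)
p_holm <- p.adjust(p_raw, method = "holm")

# One row per comparison
results <- data.frame(comparison = names(p_raw),
                      p_raw = p_raw,
                      p_holm = p_holm,
                      row.names = NULL)
results
```

    Other methods accepted by p.adjust, such as "bonferroni" or "BH", can be swapped in through the method argument.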
    

    Created on 2020-03-14 by the reprex package (v0.3.0)