Tags: modeling, glm, roc

How can I plot multiple ROC curves together?


I want to find some good predictors (genes). This is my data, log transformed RNA-seq:

          TRG     CDK6   EGFR  KIF2C  CDC20
Sample 1  TRG12  11.39  10.62   9.75  10.34
Sample 2  TRG12  10.16   8.63   8.68   9.08
Sample 3  TRG12   9.29  10.24   9.89  10.11
Sample 4  TRG45  11.53   9.22   9.35   9.13
Sample 5  TRG45   8.35  10.62  10.25  10.01
Sample 6  TRG45  11.71  10.43   8.87   9.44

I have calculated a confusion matrix for each of several models, as follows:

1- I tested each of the 23 genes individually with the code below, and kept every gene with a p-value < 0.05 as a good predictor. For example, for CDK6 I ran:

glm=glm(TRG ~ CDK6, data = df, family = binomial(link = 'logit'))

Finally I obtained five genes and I put them in this model:

final <- glm(TRG ~ CDK6 + CXCL8 + IL6 + ISG15 + PTGS2 , data = df, family = binomial(link = 'logit'))

I want a single plot showing the ROC curve of each model, like the example image, but I don't know how to make it.

[Image: example ROC plot, not preserved]

Any help please?


Solution

  • I will give you an answer using the pROC package. Disclaimer: I am the author and maintainer of the package. There are alternative ways to do it.

    The plot you are seeing was probably generated by the ggroc function of pROC. To generate such a plot from glm models, you need to 1) use the predict function to generate the predictions, 2) generate the ROC curves and store them in a list, preferably a named one so that the legend is created automatically, and 3) call ggroc.

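    library(pROC)

    # Fit the single-gene model and the combined model
    glm.cdk6 <- glm(TRG ~ CDK6, data = df, family = binomial(link = 'logit'))
    final <- glm(TRG ~ CDK6 + CXCL8 + IL6 + ISG15 + PTGS2, data = df, family = binomial(link = 'logit'))

    # Build a named list of ROC curves; the names become the legend labels
    rocs <- list()
    rocs[["CDK6"]] <- roc(df$TRG, predict(glm.cdk6))
    rocs[["final"]] <- roc(df$TRG, predict(final))

    # Plot all curves on the same axes
    ggroc(rocs)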
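
Since the question mentions testing many genes one at a time, the same three steps (predict, roc, ggroc) can be wrapped in a loop rather than written out per gene. The following is only a minimal sketch under stated assumptions: the data frame is simulated (the values are made up, not the asker's data), the `genes` vector is a hypothetical selection of column names, and ggplot2 is assumed to be installed because ggroc depends on it.

```r
library(pROC)

# Simulated stand-in for the real data frame (assumption: values are invented).
set.seed(1)
n <- 60
df <- data.frame(
  TRG   = factor(rep(c("TRG12", "TRG45"), each = n / 2)),
  CDK6  = c(rnorm(n / 2, 10, 1), rnorm(n / 2, 11, 1)),
  EGFR  = c(rnorm(n / 2, 10, 1), rnorm(n / 2, 10.5, 1)),
  KIF2C = rnorm(n, 9.5, 0.5)
)

# Hypothetical list of genes to test individually.
genes <- c("CDK6", "EGFR", "KIF2C")

# Fit one single-gene logistic model per gene and turn each into a ROC curve.
rocs <- lapply(genes, function(g) {
  fit <- glm(reformulate(g, response = "TRG"), data = df,
             family = binomial(link = "logit"))
  roc(df$TRG, predict(fit), quiet = TRUE)
})
names(rocs) <- genes  # the list names become the legend labels

# Add the combined model under its own name.
final <- glm(TRG ~ CDK6 + EGFR + KIF2C, data = df,
             family = binomial(link = "logit"))
rocs[["final"]] <- roc(df$TRG, predict(final), quiet = TRUE)

# One plot with all curves overlaid.
p <- ggroc(rocs)
print(p)
```

With real data, only the `df` and `genes` definitions would need to change. Note that these curves are computed on the same samples used to fit the models, so they are likely to look optimistic, especially with few samples.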