r, performance, nested-lists, counting

How to count the number of underspecified models selected by LASSO run on N datasets when you already have the TPR & FPR for each of them


I basically need the equivalent of a SUMIF in Excel here. I already have the True Positive Rate (TPR, aka sensitivity), the False Positive Rate (FPR), and the True Negative Rate (TNR, aka specificity, since TNR = 1 - FPR) for each of the N LASSOs I have run on a corresponding set of N datasets. They look like this:

> head(BM1_TPRs)
[[1]]
[1] 1
[[2]]
[1] 1
[[3]]
[1] 0.6666667
[[4]]
[1] 1

... N

> head(BM1_FPRs)
[[1]]
[1] 0
[[2]]
[1] 0
[[3]]
[1] 0
[[4]]
[1] 0

. . . N

> head(BM1_TNRs)
[[1]]
[1] 1
[[2]]
[1] 1
[[3]]
[1] 1
[[4]]
[1] 1

. . . N

And now, I need a function (or functions) which can count how many of the selected models have at least one omitted variable and no extraneous variables, that is, TPR < 1 and FPR = 0 (or equivalently TNR = 1).

I have already tried the following code:

Under <- lapply(BM1_TPRs, function(i) {if (i < 1) {cnt <- cnt + 1}
  cnt})

But it does not run because it can't find cnt. I also tried another version, which does run, but it returns the following:

> head(Under)
[[1]]
NULL    
[[2]]
NULL    
[[3]]
[1] 1.666667    
[[4]]
NULL

. . . Which is CLEARLY not what I was looking for!

p.s. I could also really use a function which calculates/counts the total number of correctly specified models selected, that is, those for which TPR = 1 and FPR = 0.


Solution

  • Assuming the functions that created your lists of True Positive Rates, True Negative Rates, and False Positive Rates are all valid, you could use the following to tally up the total number of underspecified models selected:

    # the True Positive Rates as a vector rather than a list
    TPRs <- unlist(BM1_TPRs)
    # the True Negative Rates as a vector rather than a list
    TNRs <- unlist(BM1_TNRs)
    # the False Positive Rates as a vector rather than a list
    FPRs <- unlist(BM1_FPRs)
    

    From there, all you need is:

    Under <- sum((TPRs < 1) & (FPRs == 0))
    

    Or, equivalently, because FPR = 1 - TNR, you could also use:

    Under <- sum((TPRs < 1) & (TNRs == 1))
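
For the p.s. in the question (counting the correctly specified models, TPR = 1 and FPR = 0), the same pattern should carry over. Below is a minimal sketch, not a definitive implementation. The `toy_TPRs` and `toy_FPRs` lists are made-up stand-ins for `BM1_TPRs` and `BM1_FPRs`, and the tolerance `tol` is my own assumption, added in case the rates came out of floating-point division and are not exactly 0 or 1. It also assumes a TPR can never exceed 1, so "not equal to 1" means "less than 1".

```r
# Hypothetical toy data standing in for BM1_TPRs / BM1_FPRs (not real results)
toy_TPRs <- list(1, 1, 2/3, 1, 0.5, 1)
toy_FPRs <- list(0, 0, 0, 0.25, 0, 0)

# Flatten the one-element lists into plain numeric vectors
TPRs <- unlist(toy_TPRs)
FPRs <- unlist(toy_FPRs)

# Assumed tolerance: compare against 1 and 0 approximately rather than with ==
tol <- 1e-8
all_found <- abs(TPRs - 1) < tol   # every true variable was selected
no_extras <- abs(FPRs) < tol       # no extraneous variable was selected

# Summing a logical vector counts its TRUE entries (the SUMIF/COUNTIF analogue)
n_under   <- sum(!all_found & no_extras)  # TPR < 1 and FPR = 0
n_correct <- sum(all_found & no_extras)   # TPR = 1 and FPR = 0

n_under
n_correct
```

With the toy data, models 3 and 5 count as underspecified and models 1, 2, and 6 as correctly specified; model 4 falls in neither group because its FPR is above 0. As for the original attempt: `lapply` always returns a list with one element per input, so it gives you a per-model result rather than a single running total, and any assignment to `cnt` inside the anonymous function would only change a local copy anyway.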