Much of my work revolves around diagnostic tests for tuberculosis. As you might imagine, it's handy to be able to quickly evaluate and validate the outputs of those tests. I wrote a function that does just that, here (pared down for clarity). In short, it takes the numeric results from the test and produces the manufacturer-specified interpretation.
This function works well for me - I've validated it against thousands of tests, and it's fast enough for anything I throw at it. I'd like to bundle it and a couple of similar functions into a package for wider use, however, and I'd like to get some feedback on it before I do so:
The function depends on a great big for loop wrapped around nested if-else statements. It isn't especially elegant and the dread for() undoubtedly damages my credibility with some (ahem), but it works. Is there a better approach to this? If so, is it sufficiently better to warrant re-writing Code That Works?
The criteria in the above function are for interpretation of the test in North America; the rest of the world follows slightly different standards. I'd like to have those available, as well. I'm considering having a separate, non-exported function for each. The various data checks (excluded from the above gist) would continue to live in the main function, which would then call the specified subfunction. Does that sound reasonable?
Any other suggestions or advice? Style, code organization - anything at all.
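To make the second question concrete, here is a minimal sketch of the wrapper-plus-subfunction layout. All names (`qft_interp`, `interp_north_america`, `interp_international`, the `criteria` argument) are hypothetical, and the helper bodies are placeholders rather than the real interpretation rules; only the dispatch structure is the point.

```r
# Hypothetical internal helper: the North American criteria would live here.
# Placeholder body: it only flags a high nil value and leaves the rest NA.
interp_north_america <- function(nil, tb, mitogen) {
  ifelse(nil > 8.0, "Indeterminate", NA_character_)
}

# Hypothetical internal helper for the other criteria.
# Deliberately unimplemented in this sketch.
interp_international <- function(nil, tb, mitogen) {
  stop("criteria not implemented in this sketch")
}

# Exported wrapper: run the data checks once, then dispatch.
qft_interp <- function(nil, tb, mitogen,
                       criteria = c("north_america", "international")) {
  # match.arg() picks the first choice by default and rejects unknown values.
  criteria <- match.arg(criteria)
  stopifnot(is.numeric(nil), is.numeric(tb), is.numeric(mitogen),
            length(nil) == length(tb), length(nil) == length(mitogen))
  helper <- switch(criteria,
                   north_america = interp_north_america,
                   international = interp_international)
  helper(nil, tb, mitogen)
}
```

With this shape the input validation lives in one place, and adding another set of criteria means adding one helper and one entry in `switch()`.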
I realize I should probably just push this baby bird out of the nest, but I work mostly in a vacuum and so am a bit nervous. Any advice is greatly appreciated.
Edit: in case you missed the link to the gist, this is the function I'm talking about.
As requested, sample test data.
Edited to reflect comments and to validate against test data
You can avoid any type of loop or if altogether and simply use R vector subscripting:
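qft.interp <- function(nil, tb, mitogen, tbnil.cutoff = 0.35) {
  # Set a tolerance to avoid floating point comparison troubles.
  tol <- .Machine$double.eps ^ 0.5
  # Set up the results vector; NA marks tests that are not yet classified.
  result <- rep(NA_character_, times = length(nil))
  # High nil value: indeterminate regardless of the other readings.
  result[nil + tol > 8.0] <- "Indeterminate"
  # TB response clears both the cutoff and 25% of nil: positive.
  result[is.na(result) & (tb - nil + tol > tbnil.cutoff) &
           (tb - nil + tol > 0.25 * nil)] <- "Positive"
  # TB response falls short, but the mitogen control is adequate: negative.
  result[is.na(result) &
           (tb - nil + tol < tbnil.cutoff | tb - nil + tol < 0.25 * nil) &
           !(mitogen - nil + tol < 0.5)] <- "Negative"
  # TB response falls short and the mitogen control is too low: indeterminate.
  result[is.na(result) &
           (tb - nil + tol < tbnil.cutoff | tb - nil + tol < 0.25 * nil) &
           mitogen - nil + tol < 0.5] <- "Indeterminate"
  result
}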
all(with(tests, qft.interp(nil, tb, mitogen)) == tests$interp)
[1] TRUE
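The tolerance in the function above matters because differences of decimal values are not always stored exactly. The following small sketch shows the effect and the same logical-subscripting pattern on a toy vector; the values, the 0.2 threshold, and the labels are illustrative only and have nothing to do with the test criteria.

```r
# Tolerance of the same size the function above uses.
tol <- .Machine$double.eps ^ 0.5

# 0.3 - 0.1 is stored as slightly less than 0.2 in double precision,
# so a plain ">=" comparison against 0.2 fails.
d <- 0.3 - 0.1
plain    <- d >= 0.2        # FALSE
tolerant <- d + tol > 0.2   # TRUE

# The same comparison applied to a whole vector via logical subscripting.
x <- c(0.3 - 0.1, 0.1, 0.5)
label <- rep("below", length(x))
label[x + tol > 0.2] <- "at or above"
label
# [1] "at or above" "below"       "at or above"
```

Adding `tol` to the left-hand side turns the strict `>` into an "at or above, give or take rounding error" test, which is why values sitting exactly on a cutoff are classified consistently.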