I have a dataset (alldata) with a time variable in hours (Time, the x variable) and a numeric measurement (Value, the y variable) for a number of patients, each identified by a PID (allPID is a vector of all PIDs). I want to calculate the area under the curve for the first 24 hours. First, I used the following script:
library(DescTools)  # provides AUC()

AUC1 <- as.data.frame(allPID)
for (i in allPID) {
  # time points and measurements of patient i within the first 24 hours
  x <- alldata[alldata$PID == i & alldata$Time <= 24, "Time"]
  y <- alldata[alldata$PID == i & alldata$Time <= 24, "Value"]
  AUC1$AUC24trap[AUC1$allPID == i] <- AUC(x, y,
                                          method = "trapezoid",
                                          na.rm = TRUE)
}
However, this script only returned an AUC for 17 of the 46 cases. I am not completely sure what exactly went wrong, but the solution seemed to be to first bind x and y into a data frame and keep only the complete cases:
AUC2 <- as.data.frame(allPID)
for (i in allPID) {
  x24 <- alldata[alldata$PID == i & alldata$Time <= 24, "Time"]
  y24 <- alldata[alldata$PID == i & alldata$Time <= 24, "Value"]
  df24 <- cbind(x24, y24)
  # keep only rows where both Time and Value are present
  df24 <- as.data.frame(df24[complete.cases(df24), ])
  AUC2$AUC24[AUC2$allPID == i] <- AUC(df24$x24, df24$y24,
                                      method = "trapezoid",
                                      na.rm = TRUE)
}
I figured that since I use complete.cases (and there are indeed no NAs in df24), I could just as well set na.rm = FALSE. However, this gives completely different results than na.rm = TRUE.
This leaves the question: why are these results so different? What is it that na.rm does in this case?
Hopefully someone can help out!
This was a bug in the NA-handling of AUC(). It has been fixed in DescTools 0.99.44.
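If you cannot update right away, or want to double-check the numbers you already have, a hand-rolled trapezoid rule in base R is one way to cross-check. The following is only a minimal sketch, not the DescTools implementation: trapz_auc and the toy data are hypothetical names made up for illustration. It assumes that dropping incomplete (Time, Value) pairs and sorting by time is what you want.

```r
# Hypothetical helper (not part of DescTools): trapezoid-rule AUC in base R.
trapz_auc <- function(x, y) {
  ok <- complete.cases(x, y)           # drop pairs where x or y is NA
  x <- x[ok]
  y <- y[ok]
  ord <- order(x)                      # trapezoids need increasing x
  x <- x[ord]
  y <- y[ord]
  if (length(x) < 2) return(NA_real_)  # fewer than two points: no area
  # sum of interval width times mean height of the two endpoints
  sum(diff(x) * (y[-1] + y[-length(y)]) / 2)
}

# Made-up toy data for a single patient, with one missing measurement
toy <- data.frame(Time = c(0, 6, 12, 24), Value = c(1, NA, 3, 5))
toy_auc <- trapz_auc(toy$Time, toy$Value)
print(toy_auc)  # 72

# Check whether the installed DescTools (if any) already contains the fix
if (requireNamespace("DescTools", quietly = TRUE)) {
  print(utils::packageVersion("DescTools") >= "0.99.44")
}
```

In the loop from the question, trapz_auc(x24, y24) could be compared against the value returned by AUC() for each patient; values that disagree would point to the affected code path.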