I am trying to calculate the ROC curve of a binary (0/1) target variable against a decision tree prediction.
When I set the prediction value to be binary, it gives me the following error:
> roc(as.numeric(pred),as.numeric(data$target))
Setting levels: control = 0, case = 1
Setting direction: controls < cases
When I set the prediction value to be a probability, it gives me the following error:
> roc(pred[,2],as.numeric(data$target))
'response' has more than two levels. Consider setting 'levels'
explicitly or using 'multiclass.roc' instead
Setting levels: control = 0.166666666666667, case = 0.232876712328767
Setting direction: controls < cases
So I am confused: what format should the prediction be in so that the ROC is calculated correctly? Why is my function showing these errors?
If you look at the documentation for pROC's roc function, you will see that its formal definition has the following form:
## Default S3 method:
roc(response, predictor, [...]
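The predictor is therefore the second argument, not the first as in your calls. Because the arguments are swapped, roc treats your predictions as the response, which is why it complains that 'response' has more than two levels when you pass probabilities. Your call should look like: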
roc(data$target, pred[,2])
If you forget the order, you can always use named arguments, in which case the order does not matter:
roc(predictor = pred[,2], response = data$target)
Also note that it is not necessary, and not even recommended, to convert the response to a numeric vector, so I removed as.numeric from the calls above.
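To check the argument order for yourself, here is a minimal sketch on simulated data. The vectors target and prob are hypothetical stand-ins for your data$target and pred[,2], not your real data. Setting levels and direction explicitly is optional; it only spares pROC from guessing them and printing the "Setting levels" and "Setting direction" messages.

```r
library(pROC)

set.seed(1)
# Hypothetical stand-ins for data$target and pred[, 2]
target <- rbinom(200, 1, 0.4)                   # binary 0/1 response
prob   <- plogis(2 * target - 1 + rnorm(200))   # predicted probabilities in (0, 1)

# Positional call: response first, predictor second
r_pos <- roc(target, prob, levels = c(0, 1), direction = "<")

# Named call: the order of the arguments no longer matters
r_named <- roc(predictor = prob, response = target,
               levels = c(0, 1), direction = "<")

auc(r_pos)
auc(r_named)  # same value as the positional call
```

Both calls describe the same curve, so the two AUC values should match.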