Tags: r, machine-learning, random-forest

R Random Forest Model: Issue with Generating Confusion Matrix


When I try to generate a confusion matrix for my random forest model, I get the following error: Error in !all.equal(nrow(data), ncol(data)) : invalid argument type

Here is the code I used:

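`library(randomForest)
library(caret)  # provides confusionMatrix()

# Split the data roughly 70/30 into training and test sets
ind <- sample(2, nrow(completeData), replace = TRUE, prob = c(0.7, 0.3))
trainData <- completeData[ind == 1, ]
testData <- completeData[ind == 2, ]

# Random forest model
rf1 <- randomForest(price ~ ., data = trainData)
print(rf1)

# Predictions on the test set
p1 <- predict(rf1, newdata = testData)
testData$p1 <- p1

# Confusion matrix (this is the call that raises the error)
confusionMatrix(table(testData$price, testData$p1))`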

Solution

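  • price and, hence, p1 are probably continuous (numeric) data, whereas confusionMatrix works with categorical data (factors). Tabulating two continuous vectors usually gives a non-square table, because the observed and predicted values have different sets of unique values. all.equal() then returns a character string describing the mismatch instead of TRUE, and negating that string with ! raises the "invalid argument type" error.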

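    You could cut your continuous data into categories and run confusionMatrix like so (adjust the breaks to the actual price range). Using the same breaks for both vectors gives both factors the same levels, so the resulting table is square: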

    confusionMatrix(table(cut(testData$price, breaks = 0:10),
                          cut(testData$p1, breaks = 0:10)))
    

    ... but why downgrade the continuous data instead of evaluating the predictions on the continuous level (scatterplot, correlation, etc.)?
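To illustrate what inspecting on the continuous level might look like, here is a minimal base-R sketch. The testData frame below is simulated purely for illustration (the real data are not shown in the question), and the choice of RMSE and MAE as error metrics is an assumption, not something the question specifies. With the real data you would skip the simulation and use the existing testData$price and testData$p1 columns.

```r
# Hypothetical stand-in for testData: simulated observed prices and
# simulated predictions (replace with the real columns in practice).
set.seed(42)
testData <- data.frame(price = runif(200, min = 0, max = 10))
testData$p1 <- testData$price + rnorm(200, sd = 1)

# Error metrics on the original (continuous) scale
err  <- testData$price - testData$p1
rmse <- sqrt(mean(err^2))                  # root mean squared error
mae  <- mean(abs(err))                     # mean absolute error
r    <- cor(testData$price, testData$p1)   # Pearson correlation

cat("RMSE:", rmse, " MAE:", mae, " correlation:", r, "\n")

# Observed vs. predicted; points close to the dashed identity line
# correspond to accurate predictions.
plot(testData$price, testData$p1,
     xlab = "Observed price", ylab = "Predicted price")
abline(a = 0, b = 1, lty = 2)
```

Unlike a confusion matrix on binned values, these measures keep the size of each error, so a prediction that is off by 0.1 is not treated the same as one that is off by 5.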