
Error: Missing data in columns: pop when running random forest regression using the ranger package


I am trying to run a random forest (RF) regression with the ranger package in R, but the predict function fails with "Error: Missing data in columns: pop" (pop is my independent variable).

For reference, with the randomForest package I can pass the argument na.action = na.omit to exclude the NA values, but ranger does not accept it. Here is a reproducible example (the layer containing NAs is named covar here rather than pop):

library(terra)
s <- rast(system.file("ex/logo.tif", package="terra")) [[1:2]] 
names(s) = c("ntl", "covar")
s[10:20, ] <- NA
 
library(ranger)
m <- ranger(ntl~., data=as.data.frame(s, na.rm=TRUE), mtry=1)
p <- predict(s, m)
#Error: Missing data in columns: covar.
#In addition: Warning message:
#In lapply(r, as.numeric) : NAs introduced by coercion

Solution

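  • Pass na.rm=TRUE to terra's predict(). ranger refuses to predict on rows with missing predictor values; with na.rm=TRUE, terra leaves out the cells that are NA in any layer before calling the model and returns NA for those cells in the output.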

    library(terra)
    #terra 1.6.53
    s <- rast(system.file("ex/logo.tif", package="terra")) [[1:2]] 
    names(s) = c("ntl", "covar")
    s[10:20, ] <- NA
    
    library(ranger)
    m <- ranger(ntl~., data=as.data.frame(s, na.rm=TRUE), mtry=1)
    p <- predict(s, m, na.rm=TRUE)
    
    p
    #class       : SpatRaster 
    #dimensions  : 77, 101, 1  (nrow, ncol, nlyr)
    #resolution  : 1, 1  (x, y)
    #extent      : 0, 101, 0, 77  (xmin, xmax, ymin, ymax)
    #coord. ref. : Cartesian (Meter) 
    #source(s)   : memory
    #name        :  prediction 
    #min value   :   0.2525767 
    #max value   : 254.6400884
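
If you want explicit control over which cells are predicted, the same result can probably be reached by hand. The following is only a sketch under the assumptions of the example above (same layer names, a fitted ranger model); the helper names v, ok and pred are made up for illustration. It pulls all cell values into a data frame, predicts only on the complete rows, and writes the predictions back into an empty raster with the same geometry.

```r
library(terra)
library(ranger)

# Same setup as in the question
s <- rast(system.file("ex/logo.tif", package = "terra"))[[1:2]]
names(s) <- c("ntl", "covar")
s[10:20, ] <- NA

m <- ranger(ntl ~ ., data = as.data.frame(s, na.rm = TRUE), mtry = 1)

# One row per cell, NA rows included (hypothetical helper names)
v <- values(s, dataframe = TRUE)
ok <- complete.cases(v)

# Predict only where all layers have data; leave the rest as NA
pred <- rep(NA_real_, nrow(v))
pred[ok] <- predict(m, data = v[ok, , drop = FALSE])$predictions

# Empty single-layer raster with the geometry of s, filled with the predictions
p <- setValues(rast(s, nlyrs = 1), pred)
names(p) <- "prediction"
```

This keeps the NA cells as NA in p, which is what na.rm=TRUE does for you in a single call, so the manual route is mainly useful when you need a custom mask or want to inspect the rows before predicting.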