Search code examples
rraster

Image classification (raster stack) with random forest (package ranger)


I'm fitting a random forest using the R package ranger to classify a raster image. The prediction function produces an error and hereafter I provide a reproducible example.

library(raster)
library(nnet)
library(ranger)
data(iris)

# put iris data into raster
r<-list()
for(i in 1:4){
  r[[i]]<-raster(nrows=10, ncols=15)
  r[[i]][]<-iris[,i]
}
r<-stack(r)
names(r)<-names(iris)[1:4]

# multinom (an example that works)
nn.model <- multinom(Species ~ ., data=iris, trace=F)
nn.pred<-predict(r,nn.model)

# ranger (doesn't work)
ranger.model<-ranger(Species ~ ., data=iris)   
ranger.pred<-predict(r,ranger.model)

The error given is

Error in v[cells, ] <- predv : incorrect number of subscripts on matrix

although the error with my real data is

Error in p[-naind, ] <- predv : number of items to replace is not a multiple of replacement length

The only thing that crosses my mind is that the ranger.prediction object includes several elements other than the predictions of interest. Anyway, how ranger could be used to predict on a raster stack?


Solution

  • Edit, 2021-07-15

    There was a question about using clusterR, and I have found a more straightforward approach that what I first suggested. The new code does the same thing as the original, but in a simpler way and with an option for parallel processing:

    # First train the ranger model
    
    ranger.model <- ranger(Species ~ .
                           , data = iris
                           , probability = TRUE  # This argument is needed for se
                           , keep.inbag = TRUE   # So is this one
                           )
    
    
    # Create prediction function for clusterR
    
    f_se <- function(model, ...) predict(model, ...)$se
    
    
    # Predict se using clusterR
      
    beginCluster(2)
    
    map_se <- clusterR(r
                       , predict
                       , args = list(ranger.model
                                     , type = 'se'  # Remember to include this argument
                                     , fun = f_se
                                     )
                       )
    
    endCluster()
    

    Original answer, 2018-05-31

    You can run predictions from a ranger model on a raster stack by training the model within the train function of the caret package:

    library(caret)
    ranger.model <- train(Species ~ ., data = iris, method = "ranger")  
    ranger.pred <- predict(r, ranger.model)
    

    However, this doesn't work if you want to predict the standard error, as the prediction function for train objects does not accept type = 'se'. I got around this by building a function for the purpose using this document:

    https://cran.r-project.org/web/packages/raster/vignettes/functions.pdf

    # Function to predict standard errors on a raster
    predfun <- function(x, model, type, filename)
    {
      out <- raster(x)
      bs <- blockSize(out)
      out <- writeStart(out, filename, overwrite = TRUE)
      for (i in 1:bs$n) {
        v <- getValues(x, row = bs$row[i], nrows = bs$nrows[i])
        nas <- apply(v, 1, function(x) sum(is.na(x)))
        p <- numeric(length = nrow(v))
        p[nas > 0] <- NA
        p[nas == 0] <- predict(object = model,
                               v[nas == 0,],
                               type = 'se')$se
        out <- writeValues(out, p, bs$row[i])
      }
      out <- writeStop(out)
      return(out)
    }
    
    # New ranger model 
    ranger.model <- ranger(Species ~ .
                           , data = iris
                           , probability = TRUE
                           , keep.inbag  = TRUE
                           )
    # Run predictions
    se <- predfun(r
                  , model = ranger.model
                  , type  = "se"
                  , filename = paste0(getwd(), "/se.tif")
                  )