
Using a test sample file with MaxEnt in R


I have recently worked a lot with MaxEnt in R (dismo package), but so far I have only validated my bird-habitat model (a single species) with cross-validation. Now I want to use a test sample file that I created myself. I had to pick these validation points by hand and can't use random test points.

So my R-script looks like this:

library(raster)
library(dismo)

setwd("H:/MaxEnt")

memory.limit(size = 400000)

punkteVG <- read.csv("Validierung_FL_XY_2016.csv", header=T, sep=";", dec=",")
punkteTG <- read.csv("Training_FL_XY_2016.csv", header=T, sep=";", dec=",")

punkteVG$X <- as.numeric(punkteVG$X)
punkteVG$Y <- as.numeric(punkteVG$Y)

punkteTG$X <- as.numeric(punkteTG$X)
punkteTG$Y <- as.numeric(punkteTG$Y)

##### mask NA ######
mask <- raster("final_merge_8class+le_bb_mask.img")
dataframe_VG <- extract(mask, punkteVG)
dataframe_VG[dataframe_VG == 0] <- NA

dataframe_TG <- extract(mask, punkteTG)
dataframe_TG[dataframe_TG == 0] <- NA

punkteVG <- punkteVG*dataframe_VG
punkteTG <- punkteTG*dataframe_TG

#### add the raster dataset ####
habitat_all <- stack("blockstats_stack_8class+le+area_8bit.img")

####  MODEL FITTING #####
options(java.parameters = "-Xmx1g")  # set before rJava is loaded so the JVM picks up the memory limit
library(rJava)
system.file(package = "dismo")

setwd("H:/MaxEnt/results_8class_LE_AREA")

### backgroundpoints ###
set.seed(0) 
backgrVMmax <- randomPoints(habitat_all, 100000, tryf=30)
backgrVM <- randomPoints(habitat_all, 1000, tryf=30)

### Renner (2015) PPM modelfitting Maxent ###
maxentVMmax_Renner<-maxent(habitat_all,punkteTG,backgrVMmax, path=paste('H:/MaxEnt/Ergebnisse_8class_LE_AREA/maxVMmax_Renner',sep=""),
                       args=c("-P", 
                              "noautofeature", 
                              "nothreshold", 
                              "noproduct",
                              "maximumbackground=400000",
                              "noaddsamplestobackground",
                              "noremoveduplicates",
                              "replicates=10", 
                              "replicatetype=subsample",
                              "randomtestpoints=20",
                              "randomseed=true",
                              "testsamplesfile=H:/MaxEnt/Validierung_FL_XY_2016_swd_NA"))

After the maxent() call I ran into several errors. First I got an error stating that "randomtestpoints" must be greater than 0 (which is the default). So I added "randomtestpoints=20" (which hopefully doesn't stop the program from using the file). Then I got:

Error: Test samples need to be in SWD format when background data is in SWD format
Error in file(file, "rt") : cannot open the connection

The thing is, when I ran the script with the default crossvalidation like this:

maxentVMmax_Renner<-maxent(habitat_all,punkteTG,backgrVMmax, path=paste('H:/MaxEnt/Ergebnisse_8class_LE_AREA/maxVMmax_Renner',sep=""),
                       args=c("-P", 
                              "noautofeature", 
                              "nothreshold", 
                              "noproduct",
                              "maximumbackground=400000",
                              "noaddsamplestobackground",
                              "noremoveduplicates",
                              "replicates=10"))

...everything works fine.

I also tried several ways to get my CSV validation data into the correct format: two columns (labeled X and Y), three columns (labeled species, X and Y), and other variants. I would rather use the "punkteVG" data frame (the validation data) that I created with read.csv, but it seems MaxEnt wants its own file.

I can't imagine my problem is so uncommon. Someone must have used the argument "testsamplesfile" before.


Solution

  • I found out what the problem was. So here it is, for others to enjoy:

    The correct maxent command for a subsample (test sample) file looks like this:

    maxentVMmax_Renner<-maxent(habitat_all, punkteTG, backgrVMmax, path=paste('H:/MaxEnt',sep=""),
                           args=c("-P", 
                                  "noautofeature", 
                                  "nothreshold", 
                                  "noproduct",
                                  "maximumbackground=400000",
                                  "noaddsamplestobackground",
                                  "noremoveduplicates",
                                  "replicates=1",
                                  "replicatetype=Subsample",
                                  "testsamplesfile=H:/MaxEnt/swd.csv"))
    

    Of course, there cannot be multiple replicates, since you have only one subsample. Most importantly, the "swd.csv" subsample file has to include:

    • the X and Y coordinates
    • the predictor values at the respective points (e.g. from "extract(habitat_all, punkteVG)")
    • a first column with the header "Species" in which every row contains the word "species" (MaxEnt uses the default name "species" if you don't define one in the occurrence data)

    So the last point was the issue here. Basically, if you don't define the species column in the subsample file, MaxEnt will not know how to assign the data.
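
The file layout described above can be sketched in base R. This is only a minimal illustration, not the exact script used here: build_swd is a hypothetical helper name, the toy coords and values stand in for punkteVG and the result of extract(habitat_all, punkteVG), and the layer names layer1/layer2 are made up. It also assumes that the predictor columns should carry the same names as the layers in the raster stack, and that points on NA cells should be dropped.

```r
# Hypothetical helper: assemble a "samples with data" (SWD) table for MaxEnt.
# coords: data frame with columns X and Y (e.g. punkteVG)
# values: predictor values at those points, e.g. the result of
#         extract(habitat_all, punkteVG[, c("X", "Y")]) from the raster package
build_swd <- function(coords, values, species = "species") {
  values <- as.data.frame(values)
  stopifnot(nrow(coords) == nrow(values))
  swd <- data.frame(Species = rep(species, nrow(coords)),  # first column: species name
                    X = coords$X,
                    Y = coords$Y,
                    values,                                # one column per predictor layer
                    check.names = FALSE)
  # Assumption: rows with NA predictor values (masked cells) are not wanted
  swd[complete.cases(swd), , drop = FALSE]
}

# Toy data standing in for the real validation points and raster values
coords <- data.frame(X = c(10.5, 11.2, 12.0), Y = c(50.1, 50.4, 50.9))
values <- data.frame(layer1 = c(3, NA, 7), layer2 = c(0.2, 0.5, 0.9))

swd <- build_swd(coords, values)

# Write a plain comma-separated file without row names
out_file <- file.path(tempdir(), "swd.csv")
write.csv(swd, out_file, row.names = FALSE, quote = FALSE)
```

With real data, the path of the written file would then be passed to maxent() through the "testsamplesfile=..." argument, as in the command above.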