Search code examples
rspatialr-rasterterra

R packages raster and terra give different min and max values for a raster


I have a large raster file (933120000 cells). When calculating min and max cell values, I get the same values (0 and 90) when using raster and terra packages in R when the file is on disk (not in memory). However, loading the file into memory and getting the min and max values again, I get min=1 and max=92 using raster, but still 0 and 90 using terra - even though I force terra to recompute the min and max. Reprex for the code shown below (note ~20GB of memory needed to load the raster into memory.

library(raster)
#> Loading required package: sp
library(terra)
#> terra 1.6.17

# create a temporary filename for the downloaded file
f <- file.path(tempdir(), "coral.tif")

#increase timeout to 30mins to allow sufficient time to download the file which is 110MB  -download can be slow
options(timeout = 30*60)

download.file("https://store.pangaea.de/Publications/YessonC-etal_2016/YessonEtAl_DSR2_2016_AntipathariaHSM.tif", f)

#load via terra and raster
coral_terra <- rast(f)
coral_raster <- raster(f)

# all the following give min = 0 and max = 90 for the raster
minmax(coral_terra)
#>     coral
#> min     0
#> max    90

minValue(coral_raster)
#> [1] 0
maxValue(coral_raster)
#> [1] 90

rgdal::GDALinfo(f)
#> rows        21600 
#> columns     43200 
#> bands       1 
#> lower left origin.x        -180 
#> lower left origin.y        -90 
#> res.x       0.008333333 
#> res.y       0.008333333 
#> ysign       -1 
#> oblique.x   0 
#> oblique.y   0 
#> driver      GTiff 
#> projection  +proj=longlat +datum=WGS84 +no_defs 
#> file        /tmp/Rtmp7UvYjv/coral.tif 
#> apparent band summary:
#>   GDType hasNoDataValue NoDataValue blockSize1 blockSize2
#> 1   Byte           TRUE           0          1      43200
#> apparent band statistics:
#>   Bmin Bmax    Bmean     Bsd
#> 1    0   90 5.291753 10.0304
#> Metadata:
#> AREA_OR_POINT=Area 
#> Authors=Yesson C, Bedford F, Rogers AD, Taylor ML 
#> Description=Habitat suitability prediction for black corals in deep (50m+) oceans.  Suitability ranges from 0-unsuitable to 100-highly suitable. 
#> doi=10.1016/j.dsr2.2015.12.004 
#> Journal=Deep Sea Research II 
#> Threshold=23 
#> Title=The global distribution of deep-water Antipatharia habitat

#read everything to disk - this requires about 20GB of memory due to the large raster size
coral_raster_on_disk <- readAll(coral_raster)

minValue(coral_raster_on_disk, min) #min is now 1
#> [1] 1
maxValue(coral_raster_on_disk) #max is now 92
#> [1] 92

#large file so remove it
rm(coral_raster_on_disk)
gc()
#>           used  (Mb) gc trigger    (Mb)   max used    (Mb)
#> Ncells 2443830 130.6    4850928   259.1    4331079   231.4
#> Vcells 3274712  25.0 1348511069 10288.4 1633177298 12460.2

#read everything to disk for terra
set.values(coral_terra)

setMinMax(coral_terra, force=TRUE)
minmax(coral_terra) #min is still 0 and max is still 90
#>     coral
#> min     0
#> max    90

#remove the file
unlink(f)

Created on 2023-03-01 with reprex v2.0.2

Running the "Raster layer statistics" tool in QGIS, I get min = 1 and max=92, so these seem to be the correct values

Anybody know why terra is not correctly calculating the min and max?


Solution

  • The data

    url <- "https://store.pangaea.de/Publications/YessonC-etal_2016/YessonEtAl_DSR2_2016_AntipathariaHSM.tif"
    f <- basename(url)
    if (!file.exists(f)) download.file(url, f, mode="wb")
    

    The problem is that the metadata stored in the file are only approximate

    library(terra)
    #terra 1.7.15
    
    describe(f)[52]
    #[1] "  Min=0.000 Max=90.000 "
    

    You can compute ignore these values and compute them with terra::setMinMax(x, force=TRUE) but that was not working. I fixed that in the development version, and I now get:

    r <- rast(f)
    setMinMax(r, force=TRUE)
    
    r
    #class       : SpatRaster 
    #dimensions  : 21600, 43200, 1  (nrow, ncol, nlyr)
    #resolution  : 0.008333333, 0.008333333  (x, y)
    #extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
    #coord. ref. : lon/lat WGS 84 (EPSG:4326) 
    #source      : YessonEtAl_DSR2_2016_AntipathariaHSM.tif 
    #name        : YessonEtAl_DSR2_2016_AntipathariaHSM 
    #min value   :                                    1 
    #max value   :                                   92 
    

    With earlier versions, you can also force the computation of these values with:

    r <- r + 0
    

    set.values reads the values into memory but it does not re-compute the min and max values.