Search code examples
rrasterterra

How can I prevent an extra level from being added when naming categorical levels in terra SpatRaster?


I have a categorical raster which has 21 categories:

>lc_2003

class       : SpatRaster 
dimensions  : 685, 827, 1  (nrow, ncol, nlyr)
resolution  : 4477.525, 4477.525  (x, y)
extent      : 4224289, 7927202, 2358424, 5425529  (xmin, xmax, ymin, ymax)
coord. ref. : NAD83 / Statistics Canada Lambert 
source      : memory 
name        : PHYSIOG 
min value   :       1 
max value   :      21 

>levels(lc_2003)
[[1]]
[1] ""

I want to give a name to each of the categorical levels. However, when I do this, an extra level appears to be added, so that now there are 22 levels. There is an extra "NA" level at the beginning of the list.

landcover_classes_cavmVec <- data.frame(lc_code = 1:21, lc_class = c("Cryptogam, herb barren", "Rush/grass, forb, cryptogam tundra", "Cryptogam barren complex (bedrock)",
                                                                     "Prostrate dwarf-shrub, herb tundra", "Graminoid, prostrate dwarf-shrub, forb tundra", "Prostrate/Hemiprostrate dwarf-shrub tundra", "Nontussock sedge, dwarf-shrub, moss tundra",
                                                                     "Tussock-sedge, dwarf-shrub, moss tundra", "Erect dwarf-shrub tundra", "Low-shrub tundra", "Missing (Cryprogram dwarf-shrub?)", "Sedge/grass, moss wetland",
                                                                     "Sedge, moss, dwarf-shrub wetland", "Sedge, moss, low-shrub wetland", "Noncarbonate mountain complex", "Carbonate mountain complex", "Nunatak complex",
                                                                     "Glaciers", "Water", "Lagoon", "Non-Arctic areas"))

>levels(lc_2003) <- landcover_classes_cavmVec

> levels(lc_2003)
[[1]]
 [1] NA                                              "Cryptogam, herb barren"                       
 [3] "Rush/grass, forb, cryptogam tundra"            "Cryptogam barren complex (bedrock)"           
 [5] "Prostrate dwarf-shrub, herb tundra"            "Graminoid, prostrate dwarf-shrub, forb tundra"
 [7] "Prostrate/Hemiprostrate dwarf-shrub tundra"    "Nontussock sedge, dwarf-shrub, moss tundra"   
 [9] "Tussock-sedge, dwarf-shrub, moss tundra"       "Erect dwarf-shrub tundra"                     
[11] "Low-shrub tundra"                              "Missing (Cryprogram dwarf-shrub?)"            
[13] "Sedge/grass, moss wetland"                     "Sedge, moss, dwarf-shrub wetland"             
[15] "Sedge, moss, low-shrub wetland"                "Noncarbonate mountain complex"                
[17] "Carbonate mountain complex"                    "Nunatak complex"                              
[19] "Glaciers"                                      "Water"                                        
[21] "Lagoon"                                        "Non-Arctic areas" 

How do I prevent this from happening?

Bonus question: what would be a good strategy to check if the lc_code variable in landcover_classes_cavmVec did in fact match up with the equivalent number in the original 1:21 lc_2003 levels? I have no idea how I would verify if this code did what I wanted it to.


Solution

  • This is copied from ?terra::levels

    library(terra)
    r <- rast(nrows=10, ncols=10)
    set.seed(0)
    values(r) <- sample(3, ncell(r), replace=TRUE)
    

    We have a raster with values 1, 2, and 3.

    You can assign categories like this

    cls <- c("forest", "water", "urban")
    d <- data.frame(id=1:3, cover=cls)
    levels(x) <- d
    x
    
    levels(x)
    #[[1]]
    #[1] NA       "forest" "water"  "urban" 
    
    cats(x)
    #[[1]]
    #  ID  cover
    #1  0   <NA>
    #2  1 forest
    #3  2  water
    #4  3  urban
    

    So there is an additional category for cells with value zero. This is to accommodate how categories are read by and can be written the GDAL library (it assumes values between 0 and 255). That is what you see, and that is correct.

    Alternatively, you can change the raster such that it start at zero

    x <- r - 1
    levels(x) <- cls
    names(x) <- "land cover"
    levels(x)
    #[[1]]
    #[1] "forest" "water"  "urban" 
    
    cats(x)
    #[[1]]
    #  ID category
    #1  0   forest
    #2  1    water
    #3  2    urban
    

    The manual also shows the approach below. That is, add another (empty) label for the missing level (zero in this case).

    levels(r) <- c("", cls)