Search code examples
rhistogramrastercategorical-dataterra

Use R to create a historgram of categorical raster values? (or, create data table with lat/long values)


I am very new to R and programming in general, please forgive my forthcoming ineptitude.

I am working with a large categorical raster. Essentially, every pixel shallower than 10 meters on the Great Barrier Reef is assigned a value: 11,12,13, or 15. (Original file here). My end goal is to create a histogram showing the frequency of the "rubble" category (which is the given by the value 12) by latitude. It would look very similar to the third panel of this figure, but where they show "coral habitat" I would be showing rubble.

I thought the best way to do this would be to try to convert the original raster into a data frame where each row represents what was a pixel and there are three columns: categorical value (11,12,13, or 15), latitude, and longitude. I could then use this data frame to create any number of basic plots.

Ideally I would like to omit NAs in the process of creating this data frame because the raster is 152,505 by 112,421, but over 99% of pixels are empty (due to the shape of the Great Barrier Reef).

I can easily read in the raster and plot it using the Raster or Terra packages:

benthic <- ("data/GBR10 GBRMP Benthic.tif")
benthic_r <-rast(benthic)
plot(benthic_r)

I tried using this to make it smaller so that future computations will be easier.

benthic2 <- na.omit(benthic)
benthic2_r <- rast(benthic2)

But found na.omit is for vector data only, so that was not successful.

I am trying to use Geocomputation with R and thought perhaps I would use the as.data.frame function somehow, but I have not been able to find a way to use that to create the sort of table I want.

I have also been browsing information about the Stars and RasterVis packages. I tried to skip over the data frame idea and create a histogram directly from the orginal raster using the rasterVis package:

dirY <- xyLayer(benthic_r, y) #trying to assign y axis to be the y values of the pixels (which are hopefully latitude...)
dirXY <- xyLayer(count(12), benthic_r) #trying to assign x values to be a count of the number of pixels with a value of 12

histogram(benthic_r, dirXY,
maxpixels = 1e+05,
strip=TRUE,
par.settings=rasterTheme(),
att = 1)

#attempting to follow https://cran.r-project.org/web/packages/rasterVis/rasterVis.pdf "histogram methods" section

I have tweaked this in many ways but everything just results in a new error.

I am not expecting a full code solution, rather if you could help me understand where to look and how to progressively build up to creating that graph that would be much appreciated. Is attempting to create a data frame first a good approach? Or is that totally unnecessary?

The resources I find on how to work with Raster data seem to all be about manipulating satellite imagery (continuous rasters), I have not found many techniques for manipulating categorical rasters. If you know of good tutorials for working with categorical rasters that would be much appreciated too.

Thanks in advance for any advice.


Solution

  • I think the easiest way is to aggregate the raster so that you get one column, with the value of interest. That is, one cell for each latitude (row). Here is an example:

    # example data 
    library(terra)
    r <- rast(nrow=18)
    values(r) <- sample(c(11,12,13,15), ncell(r), replace=TRUE)
    
    # count the number of cells that are 12
    r12 <- r == 12    
    
    # aggregate the columns
    a <- aggregate(r12, c(1, ncol(r)), sum)
    
    # get the latitude for each row
    lat <- yFromRow(a, 1:nrow(a))
    

    You can make different types of plots. Here are two examples. I prefer latitude on the vertical axis.

    plot(values(a), lat, ylab="latitude", xlab="count")
    
    barplot(rev(values(a)[,1]), horiz=T, names.arg=rev(lat), las=1)