
combining lists and dataframes in R from raster values


QUESTION EDITED FOR CLARITY AND REPRODUCIBILITY

I am trying to summarize the proportions of land-cover classes within many buffers stored in a list. Although this appears to be a common problem, I have not found an appropriate solution.

I have a raster stack called hab_stack with discrete values 1-6 for each of 3 layers (each layer == year). I also have locational data with >800,000 locations called dat_sf. I have extracted hab_stack raster values within a 400 m buffer around each location.

I now have a large list with ~800,000 elements, and not all hab classes 1-6 are represented in each element. Because I need to combine all of the proportions together, I tried to create a dummy dataframe called true_names that holds every hab_stack value 1-6 and assigns a frequency/proportion of zero to classes not represented within the buffer. I have tried to do this inside an lapply loop but can't seem to get it quite right. Below is the full function, followed by the error:

library(dplyr)

sum_class <- lapply(values_hab, function(x){

  # template with every class 1-6 and a frequency of zero
  true_names <- data.frame(x = 1:6, Freq = 0)

  prop_df <- as.data.frame(prop.table(table(x))) %>%
    mutate(x = as.numeric(x))

  true_names %>%
    anti_join(prop_df, by = "x") %>%
    bind_rows(prop_df) %>%
    arrange(x)
})

Error in `mutate()`:
! Problem while computing `x = as.numeric(x)`.
x `x` must be size 0 or 1, not 1659.
Run `rlang::last_error()` to see where the error occurred.

When I dissect the function and call table() on the whole list, table(values_hab) fails with: Error in table(values_hab) : all arguments must have the same length.
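The difference between tabulating the whole list and tabulating one element at a time can be reproduced on a small scale. This is only a sketch: `values_demo` is a hypothetical stand-in for `values_hab`, assumed here to be a list of plain numeric vectors of unequal length.

```r
# Hypothetical stand-in for values_hab: elements of different lengths.
values_demo <- list(c(1, 1, 2, NA), c(3, 3, 3, 4, 4, 6))

# Given a list, table() treats each element as a separate variable to
# cross-classify, so unequal lengths raise an error. Capture the message.
msg <- tryCatch(table(values_demo), error = function(e) conditionMessage(e))
print(msg)

# Tabulating each element separately works; NAs are dropped by default.
per_element <- lapply(values_demo, table)
print(per_element[[1]])
```

So the error from `table(values_hab)` comes from passing the list itself, which is a different situation from calling `table(x)` on a single element inside `lapply()`.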

I think a hypothetical list could look something like this, with a different number of NAs in each element and not every class represented in each element. A dataframe showing the shape of my desired output follows it:

# named example_list so it does not mask base::list()
example_list <- list(c(1,1,1,2,2,2,3,3,4,4,4,NA,NA,NA,5,6),
c(1,2,3,4,NA,NA,NA,NA,4,4,4,4,NA,5,1,1),
c(5,5,5,5,5,1,2,2,2,2,NA,NA,NA,NA,NA,3))

# one row per buffer, one column per class; each row sums to 1
desired_output <- data.frame(`1` = c(0.4, 0.5, 0.6, 0.5, 0.5, 0.3),
`2` = c(0.1, 0.1, 0.1, 0.1, 0.1, 0.2),
`3` = c(0.1, 0.1, 0.0, 0.1, 0.0, 0.3),
`4` = c(0.3, 0.2, 0.0, 0.1, 0.1, 0.1),
`5` = c(0.0, 0.1, 0.2, 0.2, 0.1, 0.0),
`6` = c(0.1, 0.0, 0.1, 0.0, 0.2, 0.1),
check.names = FALSE)  # keep the names 1-6 instead of X1-X6
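One possible way to get from a list like the one above to a table of that shape is to convert each element to a factor with fixed levels 1-6 before tabulating, so absent classes are counted as zero without a separate template dataframe. This is a base-R sketch of an alternative technique, not the approach used in the original function. The helper name `class_props` is made up for illustration, and the list is repeated so the block runs on its own.

```r
example_list <- list(
  c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, NA, NA, NA, 5, 6),
  c(1, 2, 3, 4, NA, NA, NA, NA, 4, 4, 4, 4, NA, 5, 1, 1),
  c(5, 5, 5, 5, 5, 1, 2, 2, 2, 2, NA, NA, NA, NA, NA, 3)
)

# Hypothetical helper: proportion of each class among the non-NA cells.
class_props <- function(x, classes = 1:6) {
  # Fixed levels make table() report a zero count for absent classes.
  counts <- table(factor(x, levels = classes))
  # An element with no non-NA cells would give 0/0 = NaN here.
  props <- as.numeric(counts) / sum(counts)
  names(props) <- classes
  props
}

# One row per list element, one column per class.
wide <- as.data.frame(do.call(rbind, lapply(example_list, class_props)))
print(round(wide, 2))
```

Because NAs are dropped before the proportions are computed, each row sums to 1 over the cells that actually had a value. If the proportions should instead be relative to all cells in the buffer, the denominator would need to be `length(x)`.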

Any help is much appreciated. Best,


Solution

  • It looks like my function works and this was a very easy fix. dplyr::mutate() was picking up x as the whole object rather than the vector x within each list element, which is what I wanted it to modify. Converting the column directly with prop_df$x avoids that ambiguity. R is still running in the background, but this should have taken care of it.

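    library(dplyr)

    sum_class_function <- function(x){

      # template with every class 1-6 and a proportion of zero
      true_names <- data.frame(x = 1:6, Freq = 0)

      prop_df <- as.data.frame(prop.table(table(x)))
      # the class labels come back as a factor; go through as.character()
      # to get the class values rather than the factor's integer codes
      prop_df$x <- as.numeric(as.character(prop_df$x))

      # keep the zero rows only for classes missing from this buffer
      true_names %>%
        anti_join(prop_df, by = "x") %>%
        bind_rows(prop_df) %>%
        arrange(x)
    }

    sum_class <- lapply(values_hab, sum_class_function)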
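One detail worth checking in a function like this: `as.data.frame()` on a table stores the class labels as a factor, and `as.numeric()` applied directly to a factor returns its integer codes rather than the labels. The toy vector below is made up to show the difference when some classes are missing from a buffer.

```r
# Toy buffer in which classes 1, 3 and 6 are absent.
x <- c(2, 2, 4, 5, NA)

prop_df <- as.data.frame(prop.table(table(x)))

labels_are_factor <- is.factor(prop_df$x)

# Codes are positions among the observed levels, not the class values.
codes <- as.numeric(prop_df$x)

# Converting through character recovers the actual class values.
classes <- as.numeric(as.character(prop_df$x))

print(codes)
print(classes)
```

With the direct conversion, the proportions for classes 2, 4 and 5 would be attributed to classes 1, 2 and 3, and the join against the 1-6 template would then fill in zeros for the wrong classes.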