r dataframe grid mean raster

Converting dataframe with multiple dimensions to raster layer


I am creating a raster layer for an area with multiple environmental variables. The data have usually come as netCDF files (arrays) containing lat, long, date and the variable in question, in this case sea_ice_fraction.

The data for sea surface temperature (sst) came in an understandable format, at least from the point of view of trying to make a prediction grid:

, , Date = 2019-11-25

         Long
Lat           294.875 295.125 295.375 295.625 295.875 296.125    296.375 296.625 296.875     297.125
  -60.125  2.23000002    2.04    1.83    1.53    1.18    1.00  0.9800000    1.06    1.25  1.40999997
  -60.375  2.06999993    1.79    1.60    1.31    1.09    0.97  1.0000000    1.15    1.30  1.42999995
  -60.625  1.93999994    1.64    1.45    1.28    1.14    1.02  0.9899999    1.03    1.10  1.13000000

Each row is a single latitude coordinate (at the resolution of the data), each column is a longitude coordinate, and each date is a separate slice of the array.

My goal is to calculate the mean over all dates for each coordinate cell, which is easy in the array case:

sst.c1 <- apply(sst.c1, c(1,2), mean)

Then I project the result to a Raster layer.

However, the sea ice data is in a dataframe with 4 columns: time, lat, lon, and sea_ice_fraction:
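As a minimal sketch of the array workflow described here, the example below builds a small made-up Lat x Long x Date array and averages over the date dimension. The name `sst_demo`, its values, the 0.25 degree cell size used for the extent, and the CRS string are all assumptions for illustration, not taken from the real data. The raster step only runs if the raster package is installed.

```r
# Hypothetical 2 x 2 x 3 array: Lat x Long x Date (values are made up)
sst_demo <- array(
  c(2.0, 1.0, 3.0, 4.0,   # date 1 (filled column by column)
    4.0, 3.0, 5.0, NA,    # date 2, with one missing cell
    6.0, 5.0, 7.0, 8.0),  # date 3
  dim = c(2, 2, 3),
  dimnames = list(Lat  = c("-60.125", "-60.375"),
                  Long = c("294.875", "295.125"),
                  Date = c("2019-11-25", "2019-11-26", "2019-11-27"))
)

# Keep margins 1 and 2 (Lat, Long) and average over Date;
# na.rm = TRUE ignores dates where a cell has no value
sst_mean <- apply(sst_demo, c(1, 2), mean, na.rm = TRUE)

# Optional: turn the Lat x Long matrix into a RasterLayer.
# The extent is the cell centres plus/minus half of an assumed 0.25 degree cell,
# and rows are assumed to run north to south, as in the printed sst table.
if (requireNamespace("raster", quietly = TRUE)) {
  sst_raster <- raster::raster(
    sst_mean,
    xmn = 294.75, xmx = 295.25,
    ymn = -60.5,  ymx = -60.0,
    crs = "+proj=longlat +datum=WGS84"  # assumed geographic CRS
  )
}
```

If the latitude rows of the real array run south to north instead, the matrix would need flipping before it is handed to `raster::raster()`.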

   time                   lat   lon sea_ice_fraction
   <chr>                <dbl> <dbl>            <dbl>
 1 2019-11-25T12:00:00Z -66.1 -65.1            0.580
 2 2019-11-25T12:00:00Z -66.1 -65.1           NA    
 3 2019-11-25T12:00:00Z -66.1 -65.0           NA    
 4 2019-11-25T12:00:00Z -66.1 -65.0           NA    
 5 2019-11-25T12:00:00Z -66.1 -64.9           NA    

How can I turn this dataframe into an array similar to the sst data? Or directly into a raster holding the mean across dates for each cell in the dataframe?


Solution

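  • Can you not just do this using dplyr?

    The following should work:

    library(dplyr)

    # Average sea_ice_fraction over all dates for each (lat, lon) cell
    df %>%
       group_by(lat, lon) %>%
       # na.rm = TRUE skips missing values; without it, any NA makes the cell mean NA
       summarize(sea_ice_fraction = mean(sea_ice_fraction, na.rm = TRUE)) %>%
       ungroup()

    Cells where every value is NA come out as NaN.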
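To get from per-cell means to something shaped like the sst array, or to a raster, one option is sketched below in base R. The data frame `ice_demo` and its values are made up, and the coordinates are assumed to sit on a regular 0.25 degree grid. The raster step uses `raster::rasterFromXYZ()`, which expects columns in x, y, value order on a regular grid, and only runs if the raster package is installed.

```r
# Hypothetical stand-in for the sea-ice data frame (values are made up)
ice_demo <- data.frame(
  time = rep(c("2019-11-25T12:00:00Z", "2019-11-26T12:00:00Z"), each = 4),
  lat  = rep(c(-66.125, -66.125, -66.375, -66.375), times = 2),
  lon  = rep(c(-65.125, -64.875), times = 4),
  sea_ice_fraction = c(0.5, NA, 0.2, 0.4,
                       0.7, NA, 0.4, 0.8)
)

# Lat x Long matrix of per-cell means, similar in shape to the sst table.
# Rows are sorted by increasing latitude (south to north).
ice_mat <- tapply(
  ice_demo$sea_ice_fraction,
  list(Lat = ice_demo$lat, Long = ice_demo$lon),
  mean, na.rm = TRUE
)
ice_mat[is.nan(ice_mat)] <- NA  # cells with no valid values become NA

# Long-format means in x, y, value order; the formula interface drops
# NA rows first, so cells that are always NA are simply absent
ice_xyz <- aggregate(sea_ice_fraction ~ lon + lat, data = ice_demo, FUN = mean)

# Optional: build a RasterLayer straight from the x, y, value table
if (requireNamespace("raster", quietly = TRUE)) {
  ice_raster <- raster::rasterFromXYZ(
    ice_xyz,
    crs = "+proj=longlat +datum=WGS84"  # assumed geographic CRS
  )
}
```

The same `rasterFromXYZ()` call should also accept the dplyr result, provided the columns are reordered to lon, lat, sea_ice_fraction first.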