Tags: r, netcdf, ncdf4

Nearest Non-Missing Value in NetCDF


I have two data sources for my project. I get my lat/lon values from a survey (DHS), and I use those values to extract weather data from ERA-5 Land NetCDF files (spatial resolution: 0.1° x 0.1°, or ~9 km). The survey's lat/lon values are randomly displaced by between 2 km and 10 km, so some points end up close to oceans, seas, or other bodies of water. Since ERA-5 Land contains only land-related variables, values are missing close to water.

I use the following code to extract data from the NetCDF file at the grid point nearest to a given lat/lon:

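library(tidyverse)
library(ncdf4)

nc <- nc_open("./Weather/2019_07_18.nc")

lat <- 14.67543
lon <- -17.4484

# Index of the grid cell nearest to the point along each axis
lon_idx <- which.min(abs(nc$dim$longitude$vals - lon))
lat_idx <- which.min(abs(nc$dim$latitude$vals - lat))

# Read every time step (count = -1) for that single grid cell
ncvar_get(nc, varid = "t2m",
          start = c(lon_idx, lat_idx, 1),
          count = c(1, 1, -1))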

This specific lat/lon returns all NAs, because the data are missing in the NetCDF file (confirmed by looking at the file in Panoply).

Question: How can I modify this code so that it extracts data from the nearest non-missing lat/lon in the NetCDF file? I can think of two approaches: 1) drop all missing cells from the NetCDF file, so that the code pulls the nearest non-missing value, or 2) extrapolate the values in the NetCDF file so that no cells are missing. I suspect dropping the missing cells would be less computationally intensive, but I'm willing to go another route. Thanks for your help.


Solution

  • Figured out an answer to my question after another couple of days. I'm posting this in case there's anyone else out there running into the same issue.

    The key is to use CDO (Climate Data Operators) to fill in the missing values in the NetCDF files. There are two options for filling them: nearest neighbor (setmisstonn) and inverse distance weighting (setmisstodis). Missing values are covered in section 2.6.15 of the CDO documentation: https://code.mpimet.mpg.de/projects/cdo/embedded/index.html#x1-3440002.6.15

    To use CDO on Windows, use Cygwin; you can select the CDO package in the Cygwin setup program. The command is simply:

    cdo setmisstonn infile outfile
    

    OR

    cdo setmisstodis infile outfile
    

    Here, infile is the path to the input file, and outfile is the path where the copy with the missing values filled in is written.

    To loop through all the .nc files in a folder, you can use:

    for i in *.nc; do cdo setmisstonn "$i" "${i%.nc}_nona.nc"; done
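If the folder is large or the loop may be re-run, a slightly more defensive variant of the batch loop above can help. The following is only a sketch, assuming `cdo` is on the PATH and that the input files end in `.nc`; the function name `fill_missing_dir`, its arguments, and the `_nona` output suffix are illustrative choices, not part of CDO. It writes the filled copies to a separate folder, skips files that were already processed, and stops at the first CDO error.

```shell
# Sketch: fill missing values in every .nc file of a folder with CDO.
# fill_missing_dir and its arguments are hypothetical names for illustration.

fill_missing_dir() {
    in_dir=$1                     # folder holding the original .nc files
    out_dir=$2                    # folder that receives the filled copies
    operator=${3:-setmisstonn}    # CDO operator; setmisstodis also works here

    mkdir -p "$out_dir" || return 1

    for infile in "$in_dir"/*.nc; do
        # If no file matches, the glob stays literal; skip that case.
        [ -e "$infile" ] || continue

        name=$(basename "$infile" .nc)
        outfile=$out_dir/${name}_nona.nc

        # Skip files that were already filled in an earlier run.
        if [ -e "$outfile" ]; then
            continue
        fi

        # Stop at the first failure instead of silently carrying on.
        cdo "$operator" "$infile" "$outfile" || return 1
    done
}

# Example call (the output folder name is a placeholder):
# fill_missing_dir ./Weather ./Weather_nona setmisstonn
```

Keeping the outputs in their own folder means a second run never picks up the `_nona` files as inputs, and quoting the variables keeps file names with spaces intact.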