Tags: linux, bash, netcdf, cdo-climate, nco

How do I add a specific variable from a list of NetCDF files to a different one?


I have already realized that I can use the ncks command in my terminal to transfer a variable I need (temperature) from one .nc file to another that has the same structure but holds current data (uo, vo) instead of temperature. My question is: how do I create a loop that takes the temperature variable from each of these temperature .nc files and transfers it to the sea current file of the same date?

I have already tried the script below. I'm a beginner in programming and well aware that it is probably very wrong, but this is the logic that I am going for:

for filei in $(ls *TEMP-MFSe3r1-MED-b20200901_re-sv01.00.nc) ; do
        for filej in $(ls *RFVL-MFSe3r1-MED-b20200901_re-sv01.00.nc) ; do
                ncks -v thetao $filei $filej
        done
done

The code that I have written doesn't do what I wanted. I would like it to take each *TEMP NetCDF file, extract the temperature variable, and append it to the equivalent RFVL file. Instead, it runs the loop too many times and tries to append the temperature variable to all the *RFVL files. I also realize that I never specified how my files are named. Here is an example:

20130701_d-CMCC--RFVL-MFSe3r1-MED-b20200901_re-sv01.00_crop.nc
20130701_d-CMCC--TEMP-MFSe3r1-MED-b20200901_re-sv01.00_crop.nc

The command that I would like to loop, which works as intended, is:

ncks -v thetao *TEMP*.nc *RFVL*.nc

That is, it takes the thetao variable from the *TEMP*.nc file and appends it to the *RFVL*.nc file, with the only slight problem that after I execute this command it asks me:

ncks: 20130701_d-CMCC--RFVL-MFSe3r1-MED-b20200901_re-sv01.00_crop.nc exists---'e'xit, 'o'verwrite (i.e., clobber existing file), or 'a'ppend (i.e., replace duplicate variables in, and add metadata and new variables to, existing file) (e/o/a)?

in order to know how I would like to proceed, to which I reply with a. I would prefer to automate this from the command itself so it doesn't ask me every time. Just to clarify, the two sets of files that I'm working with have exactly the same names except for the 4 letters in the middle that indicate the type of data stored in the files.

In case someone has a better method to suggest, here is what I basically want to do: every *TEMP file includes the variable thetao (potential temperature), and I want to append that variable to the equivalent *RFVL file of the same day, in order to create a list of NetCDF files that contain both temperature and sea current data (I have 60 files of each type).

I apologize for the incomplete question; this is my first time posting on Stack Overflow.


Solution

  • Assumptions:

    • each TEMP file has a matching RFVL file
    • each matching pair of RFVL / TEMP files has the same exact name, except for the 4-character RFVL vs TEMP
    • the 4-character RFVL / TEMP only shows up once in a file name (ie, we don't have to worry about names like 123_TEMP_TEMP_789.nc or abc_RFVL_TEMP_xyz.nc)
    • OP's sample ncks call is valid and does what OP wants it to (when provided with matching RFVL / TEMP files)
    • each TEMP file is guaranteed to contain the variable thetao; otherwise OP will need to provide additional details on how to detect, and how to process, a TEMP file that does not contain thetao
    • we don't have to worry about OP running this process twice and thus appending a duplicate thetao entry to the RFVL file; otherwise OP will need to provide more details on how to detect, and how to process, an RFVL file that already contains a thetao entry
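
The naming assumptions above can be checked before touching any data. The following is only a sketch in plain bash: check_pairs is a hypothetical helper name, it looks at file names only (it does not open the NetCDF files, so it says nothing about whether thetao is present), and it assumes all files sit in one directory.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check: report TEMP files that break the
# naming assumptions (no matching RFVL file, or an ambiguous name).
check_pairs() {
    local dir="${1:-.}" src base rest bad=0
    for src in "$dir"/*TEMP*.nc; do
        [ -e "$src" ] || continue            # glob matched nothing
        base="${src##*/}"                    # file name without the directory
        rest="${base/TEMP/}"                 # name with the first TEMP removed
        case "$rest" in
            *TEMP*|*RFVL*)                   # TEMP twice, or TEMP and RFVL together
                echo "ambiguous name: $base"
                bad=1
                continue
                ;;
        esac
        if [ ! -f "$dir/${base/TEMP/RFVL}" ]; then
            echo "missing RFVL for: $base"
            bad=1
        fi
    done
    return "$bad"                            # 0 if every TEMP file is cleanly paired
}
```

Calling check_pairs . in the data directory prints one line per problem file and returns a non-zero status if any were found.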

    General design:

    • get a list of TEMP file names
    • make a copy of the TEMP file name and replace TEMP with RFVL
    • run the ncks command

    One idea:

    for src in *TEMP*.nc
    do
        tgt="${src/TEMP/RFVL}"                 # same name, with TEMP swapped for RFVL
        ncks -A -v thetao "${src}" "${tgt}"    # -A appends without the e/o/a prompt
    done
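
A slightly more defensive variant of the loop is sketched below. It relies on ncks's -A (append) switch so the e/o/a prompt is not shown, skips TEMP files that have no RFVL partner, and offers a dry-run mode that only prints the commands. The function name append_thetao and the DRY_RUN variable are illustrative and not part of NCO.

```bash
#!/usr/bin/env bash
# Sketch: append thetao from each TEMP file into its matching RFVL file.
# append_thetao and DRY_RUN are hypothetical names chosen for this example.
append_thetao() {
    local dir="${1:-.}" src base tgt
    for src in "$dir"/*TEMP*.nc; do
        [ -e "$src" ] || continue              # glob matched nothing
        base="${src##*/}"                      # file name without the directory
        tgt="$dir/${base/TEMP/RFVL}"           # swap the first TEMP for RFVL
        if [ ! -f "$tgt" ]; then
            echo "skip: no RFVL match for $base" >&2
            continue
        fi
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "ncks -A -v thetao $src $tgt" # show the command, change nothing
        else
            ncks -A -v thetao "$src" "$tgt"    # -A: append without prompting
        fi
    done
}
```

Running DRY_RUN=1 append_thetao . first lists the commands that would be executed; append_thetao . then performs the appends. Since the RFVL files are modified in place, working on copies for a first run may be prudent.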