bash sed netcdf cdo-climate

Any sed option to stop matching when one line number is reached?


I have a table where only the even rows matter (shown in bold in the original post),

year month day time lat lon value
2020 1 1 00:00:00 41.97 29.68 272.4456
2020 1 1 00:00:00 47.97 21.68 -32767
2020 1 1 01:00:00 41.97 29.68 272.3355
2020 1 1 01:00:00 47.97 21.68 -32767
2020 1 1 02:00:00 41.97 29.68 272.1232
2020 1 1 02:00:00 47.97 21.68 -32767
2020 1 1 03:00:00 41.97 29.68 271.8801
2020 1 1 03:00:00 47.97 21.68 -32767

but at some point the important rows switch to the odd ones. This is the exact point where it changes (note the two consecutive "-32767" values):

year month day time lat lon value
2023 9 30 23:00:00 41.97 29.68 289.2723
2023 9 30 23:00:00 41.97 29.68 -32767
2023 10 1 00:00:00 41.97 29.68 -32767
2023 10 1 00:00:00 41.97 29.68 288.9512
2023 10 1 01:00:00 41.97 29.68 -32767
2023 10 1 01:00:00 41.97 29.68 288.7689
2023 10 1 02:00:00 41.97 29.68 -32767

I found this out after extracting the even rows with sed '1~2d' site8_slt.nc.tsv. Does anyone know how to make sed operate only up to a specific line number? That way I could run sed twice, once for the lines before the change and once for the lines after it.

Thanks in advance!

Context: I am downloading data from cds.climate.copernicus.eu in netCDF format and then converting it to csv with cdo (Climate Data Operators). I don't know why, but I get 2 rows per time step, one of which always holds the fill value -32767.

I have tried: I know that I can target a specific line with $ sed '1s/a/b/' file, but this only changes that one line. I want to apply one sed command to all lines before a specific line number and a different one to all lines after it.


Solution

  • To answer your specific question:

    $ seq 5 | sed -n '1,3p'
    1
    2
    3
    
    $ seq 5 | sed -n '4,$p'
    4
    5
    

    but it sounds like either of these might be more useful for you, since they drop every line that ends in the fill value wherever it appears; both work with any sed or any awk:

    sed '/-32767$/d' file
    awk '$NF != -32767' file
    

    For example:

    $ cat file
    year    month   day     time    lat     lon     value
    2020    1       1       00:00:00        41.97   29.68   272.4456
    2020    1       1       00:00:00        47.97   21.68   -32767
    2020    1       1       01:00:00        41.97   29.68   272.3355
    2020    1       1       01:00:00        47.97   21.68   -32767
    2020    1       1       02:00:00        41.97   29.68   272.1232
    2020    1       1       02:00:00        47.97   21.68   -32767
    2020    1       1       03:00:00        41.97   29.68   271.8801
    2020    1       1       03:00:00        47.97   21.68   -32767
    2023    9       30      23:00:00        41.97   29.68   289.2723
    2023    9       30      23:00:00        41.97   29.68   -32767
    2023    10      1       00:00:00        41.97   29.68   -32767
    2023    10      1       00:00:00        41.97   29.68   288.9512
    2023    10      1       01:00:00        41.97   29.68   -32767
    2023    10      1       01:00:00        41.97   29.68   288.7689
    2023    10      1       02:00:00        41.97   29.68   -32767
    

    $ sed '/-32767$/d' file
    year    month   day     time    lat     lon     value
    2020    1       1       00:00:00        41.97   29.68   272.4456
    2020    1       1       01:00:00        41.97   29.68   272.3355
    2020    1       1       02:00:00        41.97   29.68   272.1232
    2020    1       1       03:00:00        41.97   29.68   271.8801
    2023    9       30      23:00:00        41.97   29.68   289.2723
    2023    10      1       00:00:00        41.97   29.68   288.9512
    2023    10      1       01:00:00        41.97   29.68   288.7689
    

    $ awk '$NF != -32767' file
    year    month   day     time    lat     lon     value
    2020    1       1       00:00:00        41.97   29.68   272.4456
    2020    1       1       01:00:00        41.97   29.68   272.3355
    2020    1       1       02:00:00        41.97   29.68   272.1232
    2020    1       1       03:00:00        41.97   29.68   271.8801
    2023    9       30      23:00:00        41.97   29.68   289.2723
    2023    10      1       00:00:00        41.97   29.68   288.9512
    2023    10      1       01:00:00        41.97   29.68   288.7689
    

    awk will almost certainly be much more useful to you than sed in doing whatever else you want to do with this data.
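
If you do still want the line-number approach from the question, here is a minimal sketch of how it might look in a single awk pass. The function name `keep_alternating` and the variable `n` are made up for illustration; `n` is assumed to be the line number of the first of the two consecutive fill values (line 11 in the sample file above, not the real position in your data).

```shell
#!/bin/sh
# keep_alternating FILE N   (hypothetical helper, not part of sed, awk or cdo)
# Prints the header line, then the even-numbered lines up to and including
# line N, then the odd-numbered lines after line N.
keep_alternating() {
    awk -v n="$2" '
        NR == 1 { print; next }                    # always keep the header
        NR <= n { if (NR % 2 == 0) print; next }   # before the switch: even lines
                { if (NR % 2 == 1) print }         # after the switch: odd lines
    ' "$1"
}

# Example call against the sample file shown above:
#   keep_alternating file 11
```

This relies on knowing where the switch happens, so filtering on the fill value as shown earlier is likely the more robust choice when the position is unknown or there may be more than one switch.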