Tags: sed, tcsh

Using sed to copy data between two numerical patterns to a new file


I'm running a bunch (~320) of computational chemistry experiments, and I need to pull a small amount of data out of each of the output files so that I can do some work on it in MATLAB.

I'm pretty sure I can use sed to make this work, but try as I might I don't seem to be able to do so.

I need all of the data starting at the line beginning with "1 1" and ending with the line starting with "33 33".

 I  J      FI(I,J)      k(I,J)       K(I,J)

 1  1       -337.13279    -0.06697    -0.00430
 2  2       3804.89120     8.52972     0.54787
 3  3       3195.69653     6.01702     0.38648
 4  4       3189.18684     5.99253     0.38490
 5  5       3183.73262     5.97205     0.38359
 6  6       3174.47525     5.93737     0.38136
 7  7       3167.88746     5.91275     0.37978
 8  8       1628.80868     1.56311     0.10040
 9  9       1623.56055     1.55306     0.09975
10 10       1518.21620     1.35806     0.08723
11 11       1476.93012     1.28520     0.08255
12 12       1341.24087     1.05990     0.06808
13 13       1312.30373     1.01466     0.06517
14 14       1264.73004     0.94242     0.06053
15 15       1185.62592     0.82822     0.05320
16 16       1175.54013     0.81419     0.05230
17 17       1170.41211     0.80710     0.05184
18 18       1090.20196     0.70027     0.04498
19 19       1039.29190     0.63639     0.04088
20 20       1015.00116     0.60699     0.03899
21 21       1005.05773     0.59516     0.03823
22 22        986.55965     0.57345     0.03683
23 23        917.65537     0.49615     0.03187
24 24        842.93089     0.41863     0.02689
25 25        819.00146     0.39520     0.02538
26 26        758.39720     0.33888     0.02177
27 27        697.11173     0.28632     0.01839
28 28        628.75684     0.23292     0.01496
29 29        534.75856     0.16849     0.01082
30 30        499.35579     0.14692     0.00944
31 31        422.01320     0.10493     0.00674
32 32        409.30255     0.09870     0.00634
33 33        227.12411     0.03039     0.00195

  33 2nd derivatives larger than 0.371D-04 over     561

MATLAB is not a fan of text, so I'd rather not use text delimiters (though there are some in the header of this data section); the output should contain only the numeric lines.

The data files contain a lot of other numbers as well, so I need to match the occurrence of "1 1" at the start of the line and "33 33" as the end of the copy. These 'indices' exist only in this block of info.

I attempted to use

% sed -n /"1 1"/,/"33 33"/p input.file > output.file

But I get a WHOLE BUNCH of data in the output file, as it copies everything that shows up between any "1" and "33".

Is there any way to do what I'm looking for?

Also, I'm using tcsh, as that is what my servers run.


Solution

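  • How about using awk? The command below sets a flag t on the line whose first two fields are 1 and 1, prints every line while the flag is set, and clears the flag after the line whose first two fields are 33 and 33: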

    awk '$1=="1"&&$2=="1"{t=1};t;$1=="33"&&$2=="33"{t=0}' file
    

    As recommended by @mklement0: if there is only one block, you can avoid processing the remainder of the file by changing the command to:

    awk '$1=="1"&&$2=="1"{t=1};t;$1=="33"&&$2=="33"{exit}' file
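Since the question asks about sed specifically, here is a rough sketch of how the original range command might be tightened. The attempt in the question probably over-matches because /"1 1"/ is unanchored and matches any line containing that text. Anchoring both patterns to the start of the line and allowing any run of spaces between the two indices should narrow the range to the block. The file names sample.out and block.txt and the noise lines in the sample are made up for illustration, and this assumes the indices are separated by spaces rather than tabs:

```shell
# Build a small stand-in for one output file (hypothetical content;
# only the three numeric rows are taken from the question).
cat > sample.out <<'EOF'
 some header 1 1 33 33 noise
 11 11 unrelated numbers
 I  J      FI(I,J)      k(I,J)       K(I,J)

 1  1       -337.13279    -0.06697    -0.00430
 2  2       3804.89120     8.52972     0.54787
33 33        227.12411     0.03039     0.00195

  33 2nd derivatives larger than 0.371D-04 over     561
EOF

# ^ * : optional leading spaces at the start of the line
# 1  *1 : the two indices separated by one or more spaces
# trailing space: stops "1 1" from matching "1 10", "1 11", ...
sed -n '/^ *1  *1 /,/^ *33  *33 /p' sample.out > block.txt

cat block.txt
```

The sed command is single-quoted and contains no ! characters, so it should behave the same way under tcsh. If the real files use tabs between columns, the awk approach above is likely the safer choice, because it compares whole fields rather than matching text.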