linux, bash, awk, sed, ash

Extract the first occurrence of text between 2 patterns


I have a text file like this

----PAT1----
textaa1
textbb1
textcc1
.......
----PAT2----
----PAT1----
textaa2
textbb2
textcc2
.......
----PAT2----

I want to extract the first block of text between "----PAT1----" and "----PAT2----", including both pattern lines.

So the output should be:

----PAT1----
textaa1
textbb1
textcc1
.......
----PAT2----

How can I do that with sed or awk?

I tried the following, but it does not work (it prints every block, not only the first one):

sed -n '/PAT1/,/PAT2/p' file

Other questions show how to extract all blocks between two patterns, but they do not show how to extract only the first one.


Solution

  • One possibility with awk is to set a flag at PAT1, print lines while the flag is set, and exit right after printing the first PAT2 line:

    awk '/PAT1/ {f=1} /PAT2/ {print; exit} f' file
    

    Doing the opposite, that is, excluding that first match, is a bit more complicated, but a similar approach works: use a flag to decide whether or not to print each line, and use the patterns to toggle that flag:

    awk 'BEGIN {f=1} /PAT1/ {if (first == 0) {f=0}; first=1} /PAT2/ {if (f == 0) {f=1; next}} f' file
    

    That would print every line except the first block of lines between the patterns, including the pattern lines themselves.
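
To try both approaches end to end, here is a minimal sketch that runs them against a copy of the sample input from the question. The temporary file and the variable names (`sample`, `first_awk`, `first_sed`, `rest_awk`) are made up for illustration. The sed one-liner is an alternative that is not part of the answer above; it relies on `q` quitting as soon as the first PAT2 line inside the range has been printed.

```shell
#!/bin/sh
# Hypothetical sample input, mirroring the file shown in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
----PAT1----
textaa1
textbb1
textcc1
----PAT2----
----PAT1----
textaa2
textbb2
textcc2
----PAT2----
EOF

# awk: raise the flag at PAT1, print flagged lines,
# and stop reading right after printing the first PAT2 line.
first_awk=$(awk '/PAT1/ {f=1} /PAT2/ {print; exit} f' "$sample")

# sed alternative (an assumption, not from the answer): print the
# PAT1..PAT2 range and quit at the first PAT2 line inside it.
first_sed=$(sed -n '/PAT1/,/PAT2/{p;/PAT2/q;}' "$sample")

# awk: the inverse, printing every line except the first block.
rest_awk=$(awk 'BEGIN {f=1} /PAT1/ {if (first == 0) {f=0}; first=1} /PAT2/ {if (f == 0) {f=1; next}} f' "$sample")

printf 'first block (awk):\n%s\n' "$first_awk"
printf 'first block (sed):\n%s\n' "$first_sed"
printf 'everything else (awk):\n%s\n' "$rest_awk"

rm -f "$sample"
```

Both the awk and the sed variant stop reading the file at the first PAT2 line, so they should stay cheap even when the input is large and the first block is near the top.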