How do I remove all lines from the beginning of a file until a certain pattern is reached, except for the last one?

Example:

>"one"
>"two"
>"three"
>"title"
>12 23 14
>...

I want to remove all lines at the beginning of the file until I reach the first one in which NF==3 (awk), but keep the line named "title". This should happen just once at the beginning of the file, not repeatedly.

Thank you

Expected output:

>"title"
>12 23 14
>...

Solution

The way to do this is with awk, as you already suggested. You want to print the lines starting from the first line that has 3 fields. This can easily be done by setting a print flag (let's call it p):

awk '(NF==3){p=1};p' file

This will print everything starting from the first line with 3 fields.
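
As a rough illustration of the flag idiom, here is the same program written out in long form and run on the sample from the question. The variable names `sample` and `flag_out` are only placeholders for this sketch.

```shell
# Sample input from the question, held in a shell variable (name is arbitrary).
sample=$(printf '%s\n' '"one"' '"two"' '"three"' '"title"' '12 23 14')

# Long form of: awk '(NF==3){p=1};p' file
flag_out=$(printf '%s\n' "$sample" | awk '
  NF == 3 { p = 1 }   # a line with 3 fields sets the flag, and it stays set
  p       { print }   # while the flag is set, print the current line
')
printf '%s\n' "$flag_out"
```

On this input only the line `12 23 14` is printed, because none of the earlier lines has 3 fields.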

However, you would also like to print the line which contains the string "title". This can be done by matching this string:

awk '/title/{print}(NF==3){p=1};p' file

The problem with this is that a line containing 'title' is printed twice when it appears after the first line with 3 fields, as in a file like:

a          < not printed
title      < printed
a b c      < printed
title      < printed twice
e f g      < printed
h          < printed

So you have to be a bit more careful with the logic and combine the title check with the print condition:

awk '(NF==3){p=1};(p || /title/)' file

This again is not robust because you might have a file like:

a          < not printed
title 1    < printed
b          < not printed
title 2    < printed
a b c      < printed
h          < printed

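where you only want "title 2" to be printed. In that case, remember the most recent title line and print it once, when the first line with 3 fields is reached. The !p guards make sure this happens only at the start of the file and not at every later line with 3 fields:

awk '!p && /title/{s=$0} !p && (NF==3){p=1;print s};p' file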
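
To check the behaviour on a file with several title lines and several 3-field lines, the following sketch may help. The input is made up for the test, and it assumes the saved title should be emitted only once, so both rules are guarded with `!p`.

```shell
# Hypothetical test input: two titles before the data, one title after it.
input=$(printf '%s\n' 'a' 'title 1' 'b' 'title 2' 'a b c' 'title 3' 'd e f' 'h')

title_out=$(printf '%s\n' "$input" | awk '
  !p && /title/ { s = $0 }           # before the data starts, remember the latest title
  !p && NF == 3 { p = 1; print s }   # first 3-field line: emit the saved title once
  p                                  # from here on, print every line unchanged
')
printf '%s\n' "$title_out"
```

This prints `title 2`, `a b c`, `title 3`, `d e f` and `h`, one per line: "title 1" is dropped, and the saved title is not repeated before `d e f`.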

If "title" simply refers to the line just before the first line with 3 fields, then remember the previous line instead and print it once:

awk '!p && (NF==3){p=1;print s};p;{s=$0}' file

or, for a minor speedup (once p is set, only the first rule is evaluated for each line):

awk 'p{print; next}(NF==3){p=1;print s;print;next}{s=$0}' file
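
For reference, here is a commented long-form sketch of the previous-line approach, run on the sample from the question with one extra 3-field line appended (an assumption, to show that the header is printed only once). The variable names are placeholders.

```shell
# Question sample plus one extra data line (the extra line is an assumption).
sample=$(printf '%s\n' '"one"' '"two"' '"three"' '"title"' '12 23 14' '15 16 17')

prev_out=$(printf '%s\n' "$sample" | awk '
  p       { print; next }                   # data already started: print, skip other rules
  NF == 3 { p = 1; print s; print; next }   # first 3-field line: previous line, then this one
          { s = $0 }                        # still in the header: remember the current line
')
printf '%s\n' "$prev_out"
```

The output is `"title"`, `12 23 14` and `15 16 17`. If the very first line of the file already has 3 fields, s is still empty and a blank line is printed first; testing NR > 1 before printing s would avoid that.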