I have a certain pattern in my file as so:
....
BEGIN
any text1
any text2
END
....
BEGIN
any text3
garbage text
any text4
END
....
BEGIN
any text5
any text6
END
...
BEGIN and END are my markers, and I want to extract all the text between the markers only if the block does not contain 'garbage text'. So I expect to extract the blocks below:
any text1
any text2
any text5
any text6
How do I do it in awk? I know I can do something like:
awk '/BEGIN/{f=1;next}/END/{f=0;}f' file.log
to extract the lines between the two markers, but how do I further refine the results by filtering on the absence of 'garbage text'?
$ awk '/END/{if (rec !~ /garbage text/) print rec} {rec=rec $0 ORS} /BEGIN/{rec=""}' file
any text1
any text2
any text5
any text6
The above assumes every END is paired with a preceding BEGIN. With GNU awk, which supports a multi-char RS (and sets RT to the text that matched it), you could alternatively do:
$ awk -v RS='END\n' '{sub(/.*BEGIN\n/,"")} RT!="" && !/garbage text/' file
any text1
any text2
any text5
any text6
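If the input might contain an END with no preceding BEGIN, a flag-based variant is one way to guard against it. The following is only a sketch under a few assumptions: the markers sit alone on their own lines (as in the sample), the temporary file and the variable names `inblock`, `rec` and `out` are made up for illustration, and a stray END has been added to the sample on purpose. It uses printf "%s" rather than print so that no extra newline follows the accumulated block.

```shell
# Hypothetical sample input: the layout from the question plus one stray END.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
....
END
BEGIN
any text1
any text2
END
....
BEGIN
any text3
garbage text
any text4
END
....
BEGIN
any text5
any text6
END
...
EOF

out=$(awk '
    # Opening marker: start a fresh block and skip the marker line itself.
    /^BEGIN$/ { inblock = 1; rec = ""; next }

    # Closing marker: emit the block only if a BEGIN was seen and it is clean.
    /^END$/ {
        if (inblock && rec !~ /garbage text/) printf "%s", rec
        inblock = 0
        next
    }

    # Any other line is kept only while inside a block.
    inblock { rec = rec $0 ORS }
' "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```

On this sample the stray END is ignored and only the text1/text2 and text5/text6 blocks are printed. Anchoring the patterns with ^ and $ is a design choice that keeps lines merely containing the words BEGIN or END from being treated as markers; drop the anchors if your markers share a line with other text.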
btw instead of:
awk '/BEGIN/{f=1;next}/END/{f=0;}f' file.log
your original code should be just:
awk '/END/{f=0} f; /BEGIN/{f=1}' file.log
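The shorter form relies on awk running its rules in order for every line: END clears the flag before the print rule is reached, and BEGIN sets it only afterwards, so neither marker line is printed and no next is needed. A quick way to check that both forms agree, sketched against a made-up temporary file (the variable names `a` and `b` are just for illustration):

```shell
# Hypothetical sample input with two marked blocks.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
....
BEGIN
any text1
any text2
END
....
BEGIN
any text5
any text6
END
...
EOF

# Original form: set the flag on BEGIN and skip the marker with next,
# clear it on END, print while the flag is set.
a=$(awk '/BEGIN/{f=1;next}/END/{f=0;}f' "$tmp")

# Shorter form: END clears the flag before the print rule runs,
# BEGIN sets it after, so neither marker line is printed.
b=$(awk '/END/{f=0} f; /BEGIN/{f=1}' "$tmp")

printf '%s\n' "$b"
[ "$a" = "$b" ] && echo "both forms print the same lines"
rm -f "$tmp"
```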
See "Printing with sed or awk a line following a matching pattern" for related idioms.