I have a big file of the following type:
key = asbh
some
lines
of
**text**
key = kafeia
some
more
**text**
and
additionally
more
**text**
and
more
key = lklfh
this
is
another
block
Note (if important): the line of 'key' never contains the string of interest ('text').
By a block I mean a line starting with "key" together with all the lines up to (but not including) the next such line, so this example has 3 blocks. I would like to return all blocks containing the string 'text', i.e. the desired output is:
key = asbh
some
lines
of
**text**
key = kafeia
some
more
**text**
and
additionally
more
**text**
and
more
I tried multiple things and I hope I am in the right direction, but can't seem to get it working. These are my attempts:
less myfile.txt | sed -n '/key/,/text/p' | less
I believe this starts at the first line where it sees 'key' and just keeps going (so it returns a lot of irrelevant blocks) until it sees 'text' somewhere, and then stops. This is inspired by a similar question here, but that one does not need to pull out multiple blocks, nor to match a pattern inside the blocks.
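A small sketch of how that range address behaves, on made-up data (the sample lines and the function name range_demo are assumptions for illustration only). As far as I can tell, the range opens at the first line matching key, ignores any further key lines until a line matches text, and then opens again at the next key line, so blocks without the string are printed as well:

```shell
# Made-up sample: only the block starting with "key = b" contains "text".
range_demo() {
    printf '%s\n' 'key = a' 'x' 'key = b' 'y' 'text' 'z' 'key = c' 'w' |
        sed -n '/key/,/text/p'
}

range_demo
```

On this input only the line z is dropped: the first range runs from key = a through text, and a second range opens at key = c and runs to the end of the input because no later line matches text.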
less myfile.txt | grep -Pzl '(?s)^key([^key]|\n)*text' | less
I thought this may be better and if I could get it to work, I could probably extend it as it currently attempts to only get the text between key and text (and not until the next key).
I tried understanding how if statements work, particularly in view of this thread, but I am a novice in unix, so if someone could explain, I would be very grateful.
This might work for you (GNU sed):
sed -n '/^key/!{H;$!d};x;/text/p' file
Turn off implicit printing with -n.
If a line does not begin with key, append it to the hold space and delete it, unless it is the last line.
Otherwise, swap to the hold space and, if the collected block matches text, print it.
N.B. The end-of-file condition naturally drops through to the matching condition. The hold and pattern spaces flip-flop whenever a line beginning with key is matched.
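The steps above can also be written as a multi-line sed script with one comment per step. This is only a sketch of the same one-liner: the wrapper name print_blocks_with_text is hypothetical, and it assumes the file is passed as the first argument.

```shell
# Hypothetical wrapper around the one-liner above; $1 is the input file.
print_blocks_with_text() {
    sed -n '
        # A line that does not start a block: collect it in the hold space.
        /^key/!{
            H
            # Stop processing this line unless it is the last line of input.
            $!d
        }
        # On a key line (or at end of input): swap in the collected block.
        # The new key line (or the finished block) goes to the hold space.
        x
        # Print the collected block if it contains the string of interest.
        /text/p
    ' "$1"
}
```

Called as print_blocks_with_text myfile.txt, it should print the same blocks as the one-liner. Note that the very first key line swaps in an empty hold space, which cannot match text, so nothing is printed at that point.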