I have a file like:
exception: anythinggggg...
exception: anythinggggg...
abchdhjsdhsd
ygsuhesnkc
exception: anythingggg...
exception: anything...
..
..
I want to grep the latest 2 occurrences of the exception keyword, along with 3 lines before and 3 lines after each one.
I am using something like
grep -C 3 exception | tail -12
I am using tail -12 here because I want 6 lines per occurrence and the latest 2 occurrences. This works fine when the occurrences of exception are far from each other, but it gives me useless lines if, say, both occurrences are consecutive.
abdgjsd
abdgjsd
abdgjsd
abdgjsd
abdgjsd
abdgjsd
abdgjsd
abdgjsd
exception
exception
exception
abcd
In the above case, it gives me
abdgjsd
abdgjsd
abdgjsd
exception
exception
exception
abcd
however, what I want is
abdgjsd
exception
exception -----------------> OUTPUT FOR FIRST OCCURRENCE
exception
abcd
abdgjsd
abdgjsd
exception-----------------> OUTPUT FOR SECOND OCCURRENCE
exception
exception
abcd
Is there another way to do this? Probably something in which I can also specify the number of occurrences, and not just grep lines and tail some output from it.
The output you get is because grep stops printing context (-C) at the next match. I don't see how to make it behave otherwise.
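A small demonstration of that behavior, as a sketch on made-up input (the names sample and merged are only for illustration): when the contexts of neighboring matches overlap, grep prints each line once in a single merged block, so the per-match context cannot be recovered from its output.

```shell
# Made-up sample input: three consecutive matches surrounded by other lines.
sample=$(printf '%s\n' a1 a2 a3 exception exception exception b1 b2 b3)

# Ask for 2 lines of context around each match. The three contexts overlap,
# so grep emits one merged block and shows every line only once.
merged=$(printf '%s\n' "$sample" | grep -C 2 exception)
printf '%s\n' "$merged"
```

Here the block runs from a2 to b2, and nothing in it tells which context lines belong to which match.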
The script below (written on the command line) reads the whole file and forms an array of lines. Then it goes through the array and, for each match, prints the line together with the two lines before and after it, or fewer when the match is near the start or end of the array.
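perl -MList::Util=min,max -0777 -wnE'
    @m = split /\n/;                  # all lines of the slurped file
    for (0..$#m) {
        if ($m[$_] =~ /exception/) {
            $bi = max(0, $_-2);       # first index to print, clipped at the start
            $ei = min($_+2, $#m);     # last index to print, clipped at the end
            say for @m[$bi..$ei];
            say "---";
        }
    }
' input.txt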
The --- separators are printed for easier reviewing of the output. This prints the desired output.
The -0777 option makes it slurp the whole file into the $_ variable, which is split by newline. The iteration goes over the array indices ($#m is the index of the last element of @m). The $bi and $ei are the begin/end indices to print, which cannot always be +/- 2 from the match near the beginning and end of the array.
The output can be piped to tail, but this can't be automated: if a match is within the last two lines there will be (one or two) fewer lines of output, so the input would need to be known for a precise cut-off. Alternatively, find the indices of the matches in the script, @idx = grep { $m[$_] =~ /exception/ } 0..$#m; and use them to print only the last two.
If you are going to use something like this I'd make it a script. Then read all lines into an array directly, provide command-line options (like -C in grep), etc.
Maintaining line-by-line processing would make the job far more complicated. We need to keep track of a match so that we can print the following lines once we read them. But here we need multiple such records -- for the next match(es) as well, if they come within the following lines to be printed.
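To illustrate the bookkeeping involved, here is a rough single-pass sketch. It uses awk instead of Perl, and the function name last_ctx, its argument order, and all variable names are assumptions made for this example. Each match opens a record that starts with the buffered "before" lines, and every record that is still open receives the lines that follow until its "after" count runs out.

```shell
# usage: last_ctx PATTERN CONTEXT COUNT FILE   (hypothetical helper)
last_ctx() {
    awk -v pat="$1" -v ctx="$2" -v keep="$3" '
    {
        # 1. Hand the current line to every earlier match still owed lines.
        #    Newer records are always owed more than older ones, so stop
        #    at the first record that is complete (or already discarded).
        for (b = n; b >= 1 && (b in owed) && owed[b] > 0; b--) {
            rec[b] = rec[b] $0 "\n"
            owed[b]--
        }
        # 2. A match opens a new record: buffered lines, then the match itself.
        if ($0 ~ pat) {
            n++
            rec[n] = ""
            for (i = NR - ctx; i < NR; i++)
                if (i in prev) rec[n] = rec[n] prev[i] "\n"
            rec[n] = rec[n] $0 "\n"
            owed[n] = ctx
            # Only the last "keep" records are ever needed.
            delete rec[n - keep]
            delete owed[n - keep]
        }
        # 3. Remember this line as possible "before" context for later matches.
        prev[NR] = $0
        delete prev[NR - ctx]
    }
    END {
        for (b = n - keep + 1; b <= n; b++)
            if (b in rec) printf "%s---\n", rec[b]
    }' "$4"
}

# Example run on a made-up copy of the input from the question.
tmp=$(mktemp)
printf '%s\n' abdgjsd abdgjsd abdgjsd abdgjsd abdgjsd abdgjsd abdgjsd abdgjsd \
    exception exception exception abcd > "$tmp"
last_ctx exception 2 2 "$tmp"
```

Memory use is bounded by the context size and the number of records kept, not by the file size, at the price of the extra state described above. Note that awk processes backslash escapes in values passed with -v, so a pattern containing backslashes would need extra care.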