I was using the less command to browse a very large text log file (15 GB) and was trying to search for a multiline pattern, but after some investigation I found that less can only search for single-line patterns.
Is there a way to use grep or another command to return the line number of a multiline pattern?
The log has the following format, repeated hundreds of thousands of times:
Packet A
op_3b : 001
ctrl_2b : 01
ini_count : 5

Packet F
op_3b : 101
ctrl_2b : 00
ini_count : 4

Packet X
op_3b : 010
ctrl_2b : 11
ini_count : 98

Packet CA
op_3b : 100
ctrl_2b : 01
ini_count : 5

Packet LP
op_3b : 001
ctrl_2b : 00
ini_count : 0

Packet ZZ
op_3b : 111
ctrl_2b : 01
ini_count : 545

Packet QEA
op_3b : 111
ctrl_2b : 11
ini_count : 0
What I am trying to do is have grep or some other command return the line number at which this three-line pattern starts:
op_3b : 001
ctrl_2b : 00
ini_count : 0
Suppose that the pattern is in a file named pattern, like this:
$ cat pattern
op_3b : 001
ctrl_2b : 00
ini_count : 0
Then, try:
$ awk '$0 ~ pat' RS= pat="$(cat pattern)" logfile
Packet LP
op_3b : 001
ctrl_2b : 00
ini_count : 0
RS=
This sets the record separator, RS, to an empty string, which tells awk to treat blank lines as the record separator, so each blank-line-separated packet is read as one record.
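As a quick illustration of this paragraph mode, here is a minimal sketch. The file name sample.log and the blank lines between its packets are assumptions made for the demo; -F'\n' is added only so that $1 is the whole first line of each record.

```shell
# Build a two-packet sample in a temporary directory (hypothetical file name).
tmpdir=$(mktemp -d)
printf '%s\n' 'Packet A' 'op_3b : 001' 'ctrl_2b : 01' 'ini_count : 5' '' \
              'Packet F' 'op_3b : 101' 'ctrl_2b : 00' 'ini_count : 4' > "$tmpdir/sample.log"

# With RS set to the empty string, each blank-line-separated block is one record.
# FS is a newline here, so $1 is the first line of the block.
summary=$(awk -F'\n' '{ print "record " NR " starts with: " $1 }' RS= "$tmpdir/sample.log")
printf '%s\n' "$summary"
rm -rf "$tmpdir"
```

This should print one line per packet, which confirms that a whole packet, not a single line, is what $0 holds in the main command.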
pat="$(cat pattern)"
This tells awk to create an awk variable pat which contains the contents of the file pattern.
If your shell is bash, then a slightly more efficient form of this command would be pat="$(<pattern)". (Don't use this unless you are sure that your shell is bash.)
$0 ~ pat
This tells awk to print any record that matches the pattern. $0 is the contents of the current record, and ~ tells awk to match the text in $0 against the regular expression in pat.
(If the contents of pattern had any regex-active characters, we would need to escape them. Your current example does not have any, so this is not a problem.)
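If the pattern ever does contain regex-active characters, one way to sidestep the escaping is to compare literal strings with awk's index() function instead of ~. The following is only a sketch under assumptions: the file names and the note field are made up for the demo, packets are assumed to be separated by blank lines, and the pattern is passed through the environment (ENVIRON) because -v and command-line assignments interpret backslash escapes in the value.

```shell
tmpdir=$(mktemp -d)
# Hypothetical sample: the wanted lines contain regex-active characters.
printf '%s\n' 'Packet A' 'note : a+b (x)' 'ini_count : 5' '' \
              'Packet B' 'note : a.b [x]' 'ini_count : 0' > "$tmpdir/sample.log"
printf '%s\n' 'note : a.b [x]' 'ini_count : 0' > "$tmpdir/pattern"

# index(s, t) returns the position of the literal string t in s, or 0 if absent,
# so a non-zero result selects the record for printing.
# ENVIRON hands the pattern to awk unchanged (no escape processing).
literal=$(PAT="$(cat "$tmpdir/pattern")" awk 'index($0, ENVIRON["PAT"])' RS= "$tmpdir/sample.log")
printf '%s\n' "$literal"
rm -rf "$tmpdir"
```

Here only the Packet B record is printed. Treated as a regular expression, the same pattern would not match it, because [x] would be read as a bracket expression rather than as literal text.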
Some people prefer a different style for defining awk variables:
$ awk -v RS= -v pat="$(cat pattern)" '$0 ~ pat' logfile
Packet LP
op_3b : 001
ctrl_2b : 00
ini_count : 0
This works the same.
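The two styles do differ in one respect that does not matter for the commands here, since neither uses a BEGIN block: -v assignments are made before the program starts, while assignments written after the program text are only processed when awk reaches them in the argument list, after BEGIN has run. A small sketch (the variable name greeting is made up for illustration):

```shell
# -v: the variable is already set when BEGIN runs.
with_v=$(awk -v greeting=hello 'BEGIN { print "[" greeting "]" }')

# Operand assignment: BEGIN runs before the assignment is processed,
# so the variable is still empty at that point.
without_v=$(awk 'BEGIN { print "[" greeting "]" }' greeting=hello)

printf '%s\n%s\n' "$with_v" "$without_v"
```

The first command prints [hello] and the second prints [].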
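To also print a line number for each matching packet, use Packet as the record separator and keep a running count, n, of the lines seen so far:
$ awk -F'\n' '$0 ~ pat{print "Line Number="n+1; print "Packet" $0} {n=n+NF-1}' RS='Packet' pat="$(cat pattern)" logfile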
Line Number=20
Packet LP
op_3b : 001
ctrl_2b : 00
ini_count : 0
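As a cross-check on the reported number, the match can also be done line by line using awk's own line counter, which does not depend on how records are separated. It also avoids a multi-character RS, whose behavior POSIX leaves unspecified (gawk and mawk treat it as a regular expression). The sketch below is an alternative to the command above, not the same method: it compares each run of consecutive lines against the pattern lines as literal strings and prints the line number where a matching run starts. The file names and the sample contents are assumptions, and the pattern file is assumed to be non-empty.

```shell
tmpdir=$(mktemp -d)
# Hypothetical three-packet sample, with blank lines between packets.
printf '%s\n' 'Packet CA' 'op_3b : 100' 'ctrl_2b : 01' 'ini_count : 5' '' \
              'Packet LP' 'op_3b : 001' 'ctrl_2b : 00' 'ini_count : 0' '' \
              'Packet ZZ' 'op_3b : 111' 'ctrl_2b : 01' 'ini_count : 545' > "$tmpdir/sample.log"
printf '%s\n' 'op_3b : 001' 'ctrl_2b : 00' 'ini_count : 0' > "$tmpdir/pattern"

# First file (NR == FNR): store the m pattern lines in want[1..m].
# Second file: keep the last m lines in a ring buffer and compare them
# with the pattern each time a new line arrives.
# Appending "" forces string (not numeric) comparison.
start_lines=$(awk '
    NR == FNR { want[++m] = $0 ""; next }
    {
        buf[FNR % m] = $0 ""
        if (FNR < m) next
        for (i = 1; i <= m; i++)
            if (buf[(FNR - m + i) % m] != want[i]) next
        print FNR - m + 1
    }
' "$tmpdir/pattern" "$tmpdir/sample.log")
printf '%s\n' "$start_lines"
rm -rf "$tmpdir"
```

For this sample it prints 7, the line on which op_3b : 001 begins the matching run. Only the last few lines are held in memory at any time, so the same approach should scale to a very large log, and the printed number can then be used to jump to that line in less, for example with less +7g logfile.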