I have a log file containing lines about different users, and I'm tailing this file in real time. I want to filter it so that only the lines related to a user that I specify (e.g. 1234) are shown. The log entries look like this:
ID:101 Username=1234
ID:102 Username=1234
ID:999 UNWANTED LINE (because this ID was not assigned to user 1234)
ID:102 some log entry regarding the same user
ID:123 UNWANTED LINE (because this ID was not assigned to user 1234)
ID:102 some other text
ID:103 Username=1234
ID:103 blablabla
A dynamic ID is assigned to a user in a line like "ID:101 Username=1234". Any subsequent lines that start with that ID pertain to the same user and need to be displayed. I need a dynamic tail that collects all IDs related to the specified user (1234) and filters the example above down to:
ID:101 Username=1234
ID:102 Username=1234
ID:102 some log entry regarding the same user
ID:102 some other text
ID:103 Username=1234
ID:103 blablabla
I need to first filter the lines where "Username=1234" is found, then extract the "ID:???" from that line, then tail all lines that contain "ID:???". When another line with "Username=1234" is found, extract the new ID and use it to display the subsequent lines with this new ID.
I can chain greps to extract the ID when I read the file with cat, but the same chain doesn't work after a tail. And even if it did, how do I "watch" for a new value of the ID and dynamically update my grep pattern?
Thanks in advance!
This is a task that Awk can handle with ease (and it could be handled with Perl or Python too).
awk '$2 == "Username=1234" { ids[$1]++; } $1 in ids { print }' data
The first pattern/action pair records the ID:xxx value in the array ids for any entry where $2 is Username=1234. The second pattern/action pair checks whether the ID:xxx entry is listed in ids; if so, it prints the line. The Username=1234 lines satisfy both criteria (at least, after the entry is added to the array).
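For example, with the sample log from the question saved in a file named data, the one-liner should print exactly the output the question asks for:

$ awk '$2 == "Username=1234" { ids[$1]++; } $1 in ids { print }' data
ID:101 Username=1234
ID:102 Username=1234
ID:102 some log entry regarding the same user
ID:102 some other text
ID:103 Username=1234
ID:103 blablabla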
How do I use it so it can act like tail (i.e. print the new lines as they're added to data)?
tail -f logfile | awk …
You'd omit the name of the data file from the awk part of the command, of course. The only thing to watch for is tail hanging while it waits to fill the pipe buffer. It probably won't be a problem, but you might have to look hard at the options to tail if it takes longer than you expected for lines to appear in the Awk input.
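Putting the two pieces together, the whole pipeline would look something like this (logfile here stands in for whatever your log file is actually called):

tail -f logfile | awk '$2 == "Username=1234" { ids[$1]++; } $1 in ids { print }'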
I realized that ID:XXX doesn't always come at position $1... is there a way to match the ID with a regular expression regardless of its position in the line ($1, $2, ...)?
Yes:
awk '$2 == "Username=1234" { ids[$1]++; }
     { for (i = 1; i <= NF; i++) if ($i in ids) { print; break } }' data
The second pattern/action pair matches every line and, for each field in the line, checks whether that field is present in the ids array. If it is, it prints the line and breaks out of the loop (you could use next instead of break in this context, though the two are not equivalent in general).
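If the Username=1234 field can itself appear somewhere other than $2, the same field-scanning idea can be applied to the first rule as well. Here is a rough sketch of that variant combined with tail -f (the file name logfile and the assumption that IDs always look like ID:<digits> are assumptions, not part of the answer above):

tail -f logfile | awk '
{
    # If this line has a field that is exactly "Username=1234", remember
    # every ID:xxx field found on the same line.
    for (i = 1; i <= NF; i++)
        if ($i == "Username=1234")
            for (j = 1; j <= NF; j++)
                if ($j ~ /^ID:[0-9]+$/)
                    ids[$j]++

    # Print the line if any of its fields is an ID we have already seen.
    for (i = 1; i <= NF; i++)
        if ($i in ids) { print; break }
}'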