awk filter out messages/rules and display number of merged messages

For the company that I work for, I want to filter certain messages out of a log file (actually two).

These messages are purely informational and not particularly useful when troubleshooting errors/faults.

After long deliberation (I also posted a similar question, but for Windows and its PS/BS (a certain kind of "cow dung" ;) )), I think AWK is suitable for the job, and I have written a shell script. However, it doesn't run (as expected). Can somebody help me fill in the blanks?

#!/bin/bash

## URL that could have been the answer (but not quite)    https://stackoverflow.com/questions/10842118/explain-this-duplicate-line-removing-order-retaining-one-line-awk-command



###To sort by what you WANT to see:
##e.g awk '/term to search/' dpkg.log

#if 
#    $var_show awk '/installed/' syslog/dpkg.log
#    then
#    printf('$var_show')
#fi







##Show what DONT want to see.
if
    #$var_notshow awk /'what not to display'/ syslog/dpkg.log
    $var_notshow awk /'Status Installed'/ dpkg.log
then
wc -1 > $var_notshow 
#echo number of merged messages (of the same content): xxx merged messages #< is the amount 
echo Messages of Status installed: $var_notshow were merged
fi 
###!!Show the amount of rules (when the same rule/logged event) that were merged
## E.g. (multiple lines which state: "Status Installed: xxxxxxxxxxxxx" ) and display it as: "Messages of Status Installed: xxxx were merged. Totalling: # of messages (of Status Installed)" were merged. 

### Finally, save it in a different file.
#Like: ?? how to do that?

So, basically I have two ways of doing this: filter for what I want to see, or filter out what I don't want to see. It might be nicer/cleaner to start by filtering out the messages I don't want.

And yes, as a proof of concept, I used a standard log file on my test machine. I can adapt it to the company-specific information later.

Excerpt from log file:

 11:56:31 status half-configured grep:amd64 3.1-2
 11:56:32 status installed grep:amd64 3.1-2
 11:56:32 configure debconf:all 1.5.66 <none>
 11:56:32 status unpacked debconf:all 1.5.66
 11:56:32 status unpacked debconf:all 1.5.66
 11:56:32 status unpacked debconf:all 1.5.66
 11:56:32 status half-configured debconf:all 1.5.66
 11:56:32 status installed debconf:all 1.5.66
 11:56:32 configure gzip:amd64 1.6-5ubuntu1 <none>
 11:56:33 status half-configured util-linux:amd64 2.31.1-0.
 11:56:34 status installed util-linux:amd64 2.31.1-0.4ubuntu3
 11:56:34 configure libpam-modules-bin:amd64 1.1.8-3.6ubuntu2 <none>
 11:56:34 status unpacked libpam-modules-bin:amd64 1.1.8-3.6ubuntu2
 11:56:34 status half-configured libpam-modules-bin:amd64 1.1.8-3.6ubuntu2
 11:56:34 status installed libpam-modules-bin:amd64 1.1.8-3.6ubuntu2
 11:56:34 configure mount:amd64 2.31.1-0.4ubuntu3 <none>
 11:56:34 status unpacked mount:amd64 2.31.1-0.4ubuntu3
 11:56:34 status half-configured mount:amd64 2.31.1-0.4ubuntu3
 11:56:34 status installed mount:amd64 2.31.1-0.4ubuntu3
 11:56:34 configure procps:amd64 2:3.3.12-3ubuntu1 <none>
 11:56:34 status unpacked procps:amd64 2:3.3.12-3ubuntu1
 11:56:34 status unpacked procps:amd64 2:3.3.12-3ubuntu1
 11:56:34 status unpacked procps:amd64 2:3.3.12-3ubuntu1

Thanks in advance :)

Thomas

Solution

Show all interesting lines:

grep interesting file

Show all except uninteresting lines:

grep -v "Status uninteresting" file
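As a rough sketch of how these two commands apply to the log in the question, the snippet below runs them against a few lines copied from the excerpt. The temporary file and the variable names are only for illustration. One thing worth noting: grep is case-sensitive by default, so the pattern 'Status Installed' from the question's script would not match the lowercase "status installed" in the log unless -i is given.

```shell
# Sample data: five lines copied from the log excerpt in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
11:56:31 status half-configured grep:amd64 3.1-2
11:56:32 status installed grep:amd64 3.1-2
11:56:32 configure debconf:all 1.5.66 <none>
11:56:32 status unpacked debconf:all 1.5.66
11:56:32 status installed debconf:all 1.5.66
EOF

# Show only the interesting lines.
shown=$(grep 'status installed' "$log")

# Show everything except the uninteresting lines (-v inverts the match).
remaining=$(grep -v 'status installed' "$log")

# grep is case-sensitive by default; -i makes "Status Installed" match too.
exact_case=$(grep -c 'Status Installed' "$log" || true)
any_case=$(grep -ci 'Status Installed' "$log")

printf '%s\n' "$remaining"
```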

Count with awk:

awk '/uninteresting/{n++}END{print "uninteresting messages: "n+0}' file
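If you want a count per kind of message rather than for one fixed pattern, an awk array can tally them. The following is a minimal sketch that assumes the field layout of the excerpt in the question (time in $1, the word "status" in $2, the state in $3); a real dpkg.log may have extra leading fields, in which case the field numbers need adjusting.

```shell
# Sample data: six lines copied from the log excerpt in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
11:56:32 status unpacked debconf:all 1.5.66
11:56:32 status unpacked debconf:all 1.5.66
11:56:32 status unpacked debconf:all 1.5.66
11:56:32 status half-configured debconf:all 1.5.66
11:56:32 status installed debconf:all 1.5.66
11:56:32 configure gzip:amd64 1.6-5ubuntu1 <none>
EOF

# Assumed layout: $1 = time, $2 = "status", $3 = state.
# The output is piped through sort because the order of a for-in loop
# over an awk array is unspecified.
summary=$(awk '
    $2 == "status" { count[$3]++ }    # tally each state
    END {
        for (state in count)
            printf "Messages of status %s: %d were merged\n", state, count[state]
    }' "$log" | sort)

printf '%s\n' "$summary"
```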

Redirect command output into a new file:

grep interesting file | grep -v uninteresting > newFile

Or append to newFile:

grep interesting file | grep -v uninteresting >> newFile
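The difference between the two redirections can be checked with a small experiment. This is only an illustrative sketch; the temporary files and variable names are made up for the example.

```shell
log=$(mktemp)
out=$(mktemp)
printf '%s\n' \
    '11:56:32 status installed debconf:all 1.5.66' \
    '11:56:32 configure gzip:amd64 1.6-5ubuntu1 <none>' > "$log"

grep configure "$log" > "$out"    # > truncates first: the file holds 1 line
grep configure "$log" > "$out"    # still 1 line, the old content was replaced
after_overwrite=$(grep -c '' "$out")

grep status "$log" >> "$out"      # >> appends: the file now holds 2 lines
after_append=$(grep -c '' "$out")

echo "$after_overwrite $after_append"
```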

Do everything at once:

awk '/uninteresting/{u++;next}/interesting/{print}END{print "uninteresting lines: "u+0}' file

Example output:

this is interesting
uninteresting lines: 1
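Applied to the script in the question, this could look roughly like the sketch below. Two things in the original script stand out: `wc -1` (digit one) is presumably meant to be `wc -l`, and `$var_notshow awk ...` does not assign anything, since capturing output needs `var=$(command)`. The sketch sidesteps both by letting awk do the filtering and the counting in one pass. The temporary files stand in for dpkg.log and the output file, and the variable names are assumptions.

```shell
#!/bin/bash
# Sample input standing in for dpkg.log (lines from the question excerpt).
log=$(mktemp)
out=$(mktemp)   # hypothetical output file
cat > "$log" <<'EOF'
11:56:34 configure mount:amd64 2.31.1-0.4ubuntu3 <none>
11:56:34 status unpacked mount:amd64 2.31.1-0.4ubuntu3
11:56:34 status half-configured mount:amd64 2.31.1-0.4ubuntu3
11:56:34 status installed mount:amd64 2.31.1-0.4ubuntu3
11:56:34 status installed libpam-modules-bin:amd64 1.1.8-3.6ubuntu2
EOF

pattern='status installed'

# Drop matching lines, print the rest, and finish with a summary line.
# The whole result is redirected into the output file.
awk -v pat="$pattern" '
    $0 ~ pat { merged++; next }
    { print }
    END { printf "Messages of %s: %d were merged\n", pat, merged }
' "$log" > "$out"

cat "$out"
```

The pattern is passed in with -v so the same script can be reused for other messages by changing a single variable.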