
How to use “grep -f file-with-patterns” in a pipe where “file-with-patterns” changes frequently?


I am trying to log dropped firewall events while being able to change, at runtime, which events are excluded from the log.

I use:

logread -f | grep -v -f file-with-patterns >> logfile

in a script running in the background.

This works fine: it logs everything except the events that match the patterns listed in the file “file-with-patterns”.

However, it appears that when I update the file “file-with-patterns”, grep does not re-read it.

After changing “file-with-patterns” I have to kill and restart the script that runs this pipeline, which is cumbersome.

Can this be made to work without restarting the script? For instance, cron re-reads the crontab file automatically when it changes. Is there some feature in grep (or an equivalent utility) that works like that?

I tried changing “file-with-patterns” while the pipeline was running, but grep did not change its filtering.


Solution

  • You could replace grep -v -f file-with-patterns with something like this, untested, using any awk that supports delete array (which is all of them these days AFAIK):

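    awk -v exceptionsFile='file-with-patterns' '
        # Reload the patterns from exceptionsFile into the exceptions array.
        function readExceptions(    exception) {
            delete exceptions
            while ( (getline exception < exceptionsFile) > 0 ) {
                exceptions[exception]
            }
            close(exceptionsFile)    # so the next call re-reads from the start
        }
        {
            readExceptions()
            # Skip the line if it matches any exception as a regexp.
            for ( exception in exceptions ) {
                if ( $0 ~ exception ) {
                    next
                }
            }
            print
            fflush()    # flush each line so it reaches logfile promptly
        }
    '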
    

    That re-reads the file of exceptions each time a line of input arrives, then compares that line against every exception as a regexp and skips the line if any of them matches.

    You could, of course, introduce a counter so it only calls readExceptions() every N lines of input, or do something else to reduce how often it's called. But as long as your file of exceptions isn't massively long, that function will run in the blink of an eye, so it's probably fine to call it for every line of input.
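
    A minimal, untested-in-production sketch of the counter idea, wrapped in a shell function. The names filter_with_reload and reloadEvery are hypothetical and only for illustration; the default of 100 lines is an arbitrary assumption:

    ```shell
    # Hypothetical helper: filter stdin, re-reading the patterns file
    # only on the first line and then every reloadEvery-th line after it.
    # Usage: filter_with_reload PATTERNS_FILE [RELOAD_EVERY]
    filter_with_reload() {
        awk -v exceptionsFile="$1" -v reloadEvery="${2:-100}" '
            function readExceptions(    exception) {
                delete exceptions
                while ( (getline exception < exceptionsFile) > 0 ) {
                    exceptions[exception]
                }
                close(exceptionsFile)
            }
            # True for NR = 1, 1 + reloadEvery, 1 + 2 * reloadEvery, ...
            (NR - 1) % reloadEvery == 0 { readExceptions() }
            {
                for ( exception in exceptions ) {
                    if ( $0 ~ exception ) {
                        next
                    }
                }
                print
                fflush()    # keep output line-by-line when redirected
            }
        '
    }
    ```

    It would be used as logread -f | filter_with_reload file-with-patterns 100 >> logfile. The trade-off is that a change to the patterns file only takes effect after up to reloadEvery further lines of input have arrived.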

    By the way, there may be a much more efficient way to do the above (e.g. a hash lookup instead of a loop comparing regexps), depending on the contents of your exceptions file, the output of that logread tool and what you're really trying to match on - post a new question with sample input/output if you'd like help with that.
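
    As a rough illustration of the hash-lookup idea, here is a sketch under strong assumptions that the question does not confirm: each line of the exceptions file is taken to be one literal whitespace-delimited token (for example SRC=10.0.0.5), and a log line is dropped when any of its fields equals such a token exactly. The function name filter_by_token is hypothetical:

    ```shell
    # Hypothetical helper: drop input lines containing a field that is
    # exactly equal to a line of the exceptions file (no regexp matching).
    # Usage: filter_by_token EXCEPTIONS_FILE
    filter_by_token() {
        awk -v exceptionsFile="$1" '
            function readExceptions(    exception) {
                delete exceptions
                while ( (getline exception < exceptionsFile) > 0 ) {
                    exceptions[exception]
                }
                close(exceptionsFile)
            }
            {
                readExceptions()
                # One hash lookup per field instead of one regexp
                # comparison per exception.
                for ( i = 1; i <= NF; i++ ) {
                    if ( $i in exceptions ) {
                        next
                    }
                }
                print
                fflush()    # keep output line-by-line when redirected
            }
        '
    }
    ```

    With this approach an exception of SRC=10.0.0.5 no longer also hides SRC=10.0.0.50, which the regexp version would, but patterns that rely on regexp syntax stop working, so it only fits if the exceptions really are fixed strings.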