Tags: bash, parsing, logging, jq, tail

Parse jq output through an external bash function?


I want to extract data from a log file that consists of JSON strings, and I wonder whether I can use a bash function to do custom parsing instead of overloading the jq command.

Command:

tail errors.log --follow | jq --raw-output '. | [.server_name, .server_port, .request_file] | @tsv' 

Outputs:

8.8.8.8     80     /var/www/domain.com/www/public

I want to trim the third column to drop the /var/www/domain.com prefix. Here /var/www/domain.com is the document root and /var/www/domain.com/subdomain/public is the public HTML section of the site, so the output should be /subdomain/public (or, for the example above, /www/public).

Can I somehow inject a bash function to parse the .request_file column? Or how would I do that in jq?

I'm having trouble piping the output of this command into anything that would let me do string manipulation.


Solution

  • Use a BashFAQ #1 while read loop to iterate over the lines, and a BashFAQ #100 parameter expansion to perform the desired modifications:

    tail -f -- errors.log \
      | jq --raw-output --unbuffered \
           '[.server_name, .server_port, .request_file] | @tsv' \
      | while IFS=$'\t' read -r server_name server_port request_file; do
          printf '%s\t%s\t%s\n' "$server_name" "$server_port" "/${request_file#/var/www/*/}"
        done
    

    Note the use of --unbuffered, which forces jq to flush each output line immediately instead of buffering it. This has a performance penalty (so it's not the default), but without it the while read loop may see nothing until jq's buffer fills, which can take a long time when the input source is slow.
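Since the question asks specifically about a bash function, here is a minimal sketch of how the loop above could be wrapped in one. The names strip_docroot and rewrite_rows are hypothetical, not part of the original answer, and the document root is assumed to have the shape /var/www/<domain>. Paths that do not match that shape are passed through unchanged. The sample row at the end stands in for jq's @tsv output so the snippet runs without a log file.

```shell
#!/usr/bin/env bash

# Hypothetical helper: strip an assumed document root of the form
# /var/www/<domain> from the front of a path. Other paths are printed as-is.
strip_docroot() {
  local path=$1
  case $path in
    /var/www/*/*) printf '/%s\n' "${path#/var/www/*/}" ;;
    *)            printf '%s\n' "$path" ;;
  esac
}

# Hypothetical helper: read tab-separated rows on stdin and rewrite the
# third column with strip_docroot.
rewrite_rows() {
  local server_name server_port request_file
  while IFS=$'\t' read -r server_name server_port request_file; do
    printf '%s\t%s\t%s\n' \
      "$server_name" "$server_port" "$(strip_docroot "$request_file")"
  done
}

# Sample row standing in for one line of jq's @tsv output.
printf '8.8.8.8\t80\t/var/www/domain.com/www/public\n' | rewrite_rows
```

In the real pipeline, rewrite_rows would take the place of the inline while loop after the jq stage. The command substitution forks a subshell per line, which is usually fine for a followed log but slower than the inline parameter expansion.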


    That said, it's also easy to remove a prefix in jq, so there's no particular reason to do the above:

    tail -f -- errors.log | jq -r '
      def withoutPrefix: sub("^([/][^/]+){3}"; "");
      [.server_name, .server_port, (.request_file | withoutPrefix)] | @tsv'