
Finding grok pattern for file with varying structure


I have a log file in which not all lines share the same format. How do I find the correct grok pattern for such a file?

[15:37:20:030|1] [TdmUtil.c: 1534:fnTDM_LoadLocalFoo] F_LAA       : 1
[15:37:20:032|1] [TdmUtil.c: 1281:fnTDM_GetPreDef]  pdeGetData : MAX_IRAT_NBR_PER_SERVED_CELL_SYS = 256
[15:37:20:091|1] [TdmUtil.c:  293:fnTDM_PrtIndexKey] fnTDM_GetIndexKeyNum Error!!

Some lines follow the format of line 1, some that of line 2, and so on. I could write a grok pattern for each of these formats, but I have no idea how to combine them. Is there a way to solve this?


Solution

  • I have put something together for you, but before I share it, I suggest using an online grok debugger to write your grok patterns (if you use Kibana, there is one built in under Dev Tools -> Grok Debugger). You should also check out the list of available grok patterns.

    All three lines share the same prefix: [time|num] [class: line number: function name], followed by the log text. I have created a grok pattern for that prefix. If you want to break the log text down further, uncomment the second match, which runs against the text field, and provide the grok pattern you need.

    NOTE: You can add as many match sections as you want, but beware that with break_on_match => false grok will try to run every one of them against each event. For high-complexity cases, consider using if/else statements to route events to the right pattern; usually this is not needed.

    input {
        file {
            path => "C:/work/elastic/logstash-6.5.0/config/test.txt"
            start_position => "beginning"
            codec => multiline {
                pattern => "^\[%{TIME}\|"
                negate => true
                what => "previous"
            }
            type => "whatever"
        }
    }
    filter {
        if [type] == "whatever" {
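            grok {
                # Keep evaluating later matches even after one succeeds.
                break_on_match => false
                # Shared prefix: [time|num] [class: line number: function], then the free-form text.
                match => { "message" => "^\[%{TIME:time}\|%{NUMBER:num}\]%{SPACE}\[%{DATA:class}:%{SPACE}%{NUMBER:linenumber:int}:%{DATA:function}\]%{GREEDYDATA:text}$" }
                # Uncomment and supply a pattern to break the text field down further.
                #match => { "text" => "" }
            }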
        }
    }
    
    output {
        elasticsearch {
            hosts => ["http://localhost:9200"]
            index => "test"
        }
    }
    

    With the configuration above, each event in Kibana will have the following fields: time, num, class, linenumber (converted to an integer), function, and text.
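
    To combine separate patterns for the different line shapes, grok accepts a list of patterns for a single field and, with the default break_on_match => true, stops at the first one that matches. Below is a minimal sketch of a second grok filter on the text field, assuming the shared prefix has already been parsed by the first grok filter. The field names source, key, value and failed_call are hypothetical, and the patterns are derived only from the three sample lines in the question, so they will likely need adjusting for the real file.

    filter {
        if [type] == "whatever" {
            # Patterns are tried in order; the first one that matches wins.
            #   1st: "pdeGetData : MAX_IRAT_NBR_PER_SERVED_CELL_SYS = 256"
            #   2nd: "F_LAA       : 1"
            #   3rd: "fnTDM_GetIndexKeyNum Error!!"
            # All captured field names here are illustrative assumptions.
            grok {
                match => {
                    "text" => [
                        "^%{SPACE}%{WORD:source}%{SPACE}:%{SPACE}%{WORD:key}%{SPACE}=%{SPACE}%{NUMBER:value}",
                        "^%{SPACE}%{WORD:key}%{SPACE}:%{SPACE}%{NUMBER:value}",
                        "^%{SPACE}%{WORD:failed_call}%{SPACE}Error!!"
                    ]
                }
            }
        }
    }

    Events whose text matches none of the listed patterns are tagged with _grokparsefailure by default, which makes it easy to find line formats that still need a pattern.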