
Grok filter for logstash to match a specific value from a log file


I have the following log:

2018-10-30 11:47:52 INFO     30464 SMS-MT [cid:300038] [queue-msgid:bb7a195d-fb23-42ae-bbfa-d2dcda405af9] [smpp-msgid:j.11082.639364178944.#MARKET SETU] [status:ESME_ROK] [prio:1] [dlr:NO_SMSC_DELIVERY_RECEIPT_REQUESTED] [validity:none] [from:2323232] [to:23232132312] [content:'#MARKET SETUP\nadsadadadadasdasdadaasdada mo ang:\nC jean_rivera\n--Mag reply ng A-C']

I've written a grok filter using Logstash patterns so I can parse the log the way I want. This is what I have so far:

%{DATESTAMP:Timestamp} %{LOGLEVEL:Level}     %{BASE10NUM:Pid} %{USERNAME:SMS_TYPE} %{CID:CID} %{GREEDYDATA:Message}

I'm trying to create a grok pattern that matches 300038, the number that comes after cid:. The syntax is always the same: [cid:number]. My custom pattern is currently:

    CID (\[cid:[0-9]{6}\])
but that results in:
"CID": [
    [
      "[cid:300038]"
    ]
  ],

I only want to capture 300038, without the [cid:] part.


Solution

  • There is more than one space between the log level (INFO) and the PID (30464), so a single literal space will not match there. You can match the whole run of whitespace with \s*.

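    To capture just the number from [cid:300038], use an inline named capture group instead of the custom CID pattern: \[cid:(?<CID>[0-9]{1,})\]. The literal \[cid: and \] are still matched, but they sit outside the group, so only the digits are stored in CID. The quantifier {1,} (equivalent to +) accepts a cid of any length, not just 6 digits.

    Your pattern then becomes: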

    %{DATESTAMP:Timestamp} %{LOGLEVEL:Level}\s*%{BASE10NUM:Pid} %{USERNAME:SMS_TYPE} \[cid:(?<CID>[0-9]{1,})\] %{GREEDYDATA:Message}
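
For context, here is a minimal sketch of how that expression might sit inside a Logstash pipeline. It assumes the raw log line arrives in the message field, which is an assumption about your input and not something stated in the question. Because the capture group is written inline, the separate custom CID pattern definition is no longer needed.

```
filter {
  grok {
    # Assumption: the raw log line is in the "message" field.
    # (?<CID>...) is an inline named capture, so only the digits
    # between "[cid:" and "]" are stored in the CID field.
    match => {
      "message" => "%{DATESTAMP:Timestamp} %{LOGLEVEL:Level}\s*%{BASE10NUM:Pid} %{USERNAME:SMS_TYPE} \[cid:(?<CID>[0-9]{1,})\] %{GREEDYDATA:Message}"
    }
  }
}
```

With the sample line from the question, CID should come out as 300038 instead of [cid:300038].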