Search code examples
logstashlogstash-grok

Grok can't create optional fields


I'm trying to parse logs in logstash with use of grok. In error log there are two more optional fields like requestUrl and requestMethod. but when i use grok for parsing logs to logstash.these from the error log grok can't create these two fields.

mypipeline.conf :

input {
 beats {
    type=>"mytest"
    port => 5044
     
  }
} 
filter{
    if [fields][log_type] == "gbase"
    {
       if [level] in [ "Error", "Fatal" ] 
        {
            grok { match=> ["message","%{TIMESTAMP_ISO8601:timestamp} %{INT:processId} %{LOGLEVEL:level} %{DATA:logger} %{USER:user} %{IPV4:clientIp} %{URI:requestUrl} %{USER:requestMethod} %{GREEDYDATA:message}"]
            overwrite => [ "message" ]}
                mutate { gsub => ["message", "\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{4} ",""]}
                mutate { gsub => ["message", "%{level}",""]}
                mutate { gsub => ["message", "%{logger}",""]}
                mutate { gsub => ["message", "%{clientIp}",""]}
                mutate { gsub => ["message", "%{processId}",""]}
        }
        else
        {
            grok { match=> ["message","%{TIMESTAMP_ISO8601:timestamp} %{INT:processId} %{LOGLEVEL:level} %{DATA:logger} %{USER:user} %{IPV4:clientIp} %{GREEDYDATA:message}" ]
            overwrite => [ "message" ]}
                mutate { gsub => ["message", "\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{4} ",""]}
                mutate { gsub => ["message", "%{level}",""]}
                mutate { gsub => ["message", "%{logger}",""]}
                mutate { gsub => ["message", "%{clientIp}",""]}
                mutate { gsub => ["message", "%{processId}",""]}
        }
        
    }
    if [fields][log_type] == "finance" 
    {
       if [level] in [ "Error", "Fatal" ] 
        {
            grok { match=> ["message","%{TIMESTAMP_ISO8601:receivedAt} \[%{INT:processId}]\ %{LOGLEVEL:level} %{DATA:logger} %{USER:user} %{IPV4:clientIp} %{URI:requestUrl} %{WORD:requestMethod} %{GREEDYDATA:message}"]
            overwrite => [ "message" ]}
                mutate { gsub => ["message", "\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{4} ",""]}
                mutate { gsub => ["message", "%{level}",""]}
                mutate { gsub => ["message", "%{logger}",""]}
                mutate { gsub => ["message", "%{clientIp}",""]}
                mutate { gsub => ["message", "%{processId}",""]}
        }
        else
        {
            grok { match=> ["message","%{TIMESTAMP_ISO8601:receivedAt} \[%{INT:processId}]\ %{LOGLEVEL:level} %{DATA:logger} %{USER:user} %{IPV4:clientIp} %{GREEDYDATA:message}"]
        overwrite => [ "message" ]}
                mutate { gsub => ["message", "\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{4} ",""]}
                mutate { gsub => ["message", "%{level}",""]}
                mutate { gsub => ["message", "%{logger}",""]}
                mutate { gsub => ["message", "%{clientIp}",""]}
                mutate { gsub => ["message", "%{processId}",""]}
        }
        
    
    }
   date {
        match => [ "timestamp", "yyyy-MM-dd HH:mm:ss.SSS" ]
        #match => [ "time" , "dd/MMM/yyyy:HH:mm:ss Z" ]
        target=> "@timestamp"
    }
}
output {
 
    if [fields][log_type] == "finance"
    {
        elasticsearch 
        {
            hosts => ["http://localhost:9200"] 
            index => "finance-%{+YYYY.MM.dd}"
            user => "something"
            password => "something"     
        }
    } 
    if [fields][log_type] == "gbase"
    {
        elasticsearch 
        {
            hosts => ["http://localhost:9200"] 
            index => "gbase-%{+YYYY.MM.dd}"
            user => "something"
            password => "something" 
        }
    }
  stdout { codec => rubydebug }
 }

filebeat.yml

filebeat.inputs:
- type: log 
  enabled: true
  paths:
    - D:\Git\gbase.API\API\Logs\*.log
  tags: ["gbaseapi"]
  fields: 
    log_type: gbase   
  multiline.pattern:  ^\d\d\d\d-\d\d-\d\d
  multiline.negate: true
  multiline.match: after
  multiline.max_lines : 1000

- type: log 
  enabled: true
  paths:
    - D:\Git\finance.api\FinanceAPI\logs\*.log
  tags: ["financeapi"]
  fields: 
    log_type: finance
  multiline.pattern:  ^\d\d\d\d-\d\d-\d\d
  multiline.negate: true
  multiline.match: after    
  multiline.max_lines : 1000 

my log sample :

2020-10-10 21:14:59.8746 22152 WARN Validation bhavin 192.168.43.244 Sensitive data logging is enabled. Log entries and exception messages may include sensitive application data, this mode should only be enabled during development. 
2020-10-10 21:15:35.2745 22152 DEBUG QuartzSchedulerThread bhavin 192.168.43.244 Batch acquisition of 0 triggers 
2020-10-10 21:15:35.2783 22152 ERROR DeveloperExceptionPageMiddleware bhavin 192.168.43.244 http://bhavin/Leads/New POST An unhandled exception has occurred while executing the request.System.NullReferenceException: Object reference not set to an instance of an object.

json document

{
         "input" => {
        "type" => "log"
    },
     "processId" => "34076",
           "log" => {
          "file" => {
            "path" => "D:\\Git\\gbase.API\\API\\Logs\\log-2020-10-10.log"
        },
         "flags" => [
            [0] "multiline"
        ],
        "offset" => 43633
    },
      "clientIp" => "192.168.43.244",
        "fields" => {
        "log_type" => "gbase"
    },
     "timestamp" => "2020-10-10 21:29:00.1492",
           "ecs" => {
        "version" => "1.5.0"
    },
          "user" => "bhavin",
         "agent" => {
                "name" => "Bhavin",
            "hostname" => "Bhavin",
                "type" => "filebeat",
             "version" => "7.9.2",
                  "id" => "b5decaac-61a3-4dbc-860b-4423a0379391",
        "ephemeral_id" => "208d7c9d-bdcd-45e9-8f82-ccdb08292203"
    },
         "level" => "ERROR",
       "message" => "http://bhavin/Leads/New POST An unhandled exception has occurred while executing the request.GurukulERP.Logic.InvalidDataException: Provide a valid PhoneNumber.\n   at GurukulERP.API.Controllers.CRM.LeadsController.New(AdmissionLead model) in D:\\Git\\gbase.API\\API\\Controllers\\CRM\\LeadsController.cs:line 515\n   at lambda_method(Closure , Object , Object[] )\n   at Microsoft.Extensions.Internal.ObjectMethodExecutor.Execute(Object target, Object[] parameters)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.SyncObjectResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.InvokeActionMethodAsync()\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.InvokeNextActionFilterAsync()\n--- End of stack trace from previous location where exception was thrown ---\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.InvokeInnerFilterAsync()\n--- End of stack trace from previous location where exception was thrown ---\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeNextResourceFilter>g__Awaited|24_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Rethrow(ResourceExecutedContextSealed context)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.InvokeFilterPipelineAsync()\n--- End of stack trace from previous location where exception was thrown ---\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)\n   at Microsoft.AspNetCore.Routing.EndpointMiddleware.<Invoke>g__AwaitRequestTask|6_0(Endpoint endpoint, Task requestTask, ILogger logger)\n   at Microsoft.AspNetCore.Session.SessionMiddleware.Invoke(HttpContext context)\n   at Microsoft.AspNetCore.Session.SessionMiddleware.Invoke(HttpContext context)\n   at Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)\n   at Microsoft.AspNetCore.ResponseCompression.ResponseCompressionMiddleware.Invoke(HttpContext context)\n   at Swashbuckle.AspNetCore.SwaggerUI.SwaggerUIMiddleware.Invoke(HttpContext httpContext)\n   at Swashbuckle.AspNetCore.Swagger.SwaggerMiddleware.Invoke(HttpContext httpContext, ISwaggerProvider swaggerProvider)\n   at Microsoft.AspNetCore.Diagnostics..Invoke(HttpContext context)    at GurukulERP.API.Controllers.CRM.LeadsController.New(AdmissionLead model) in D:\\Git\\gbase.API\\API\\Controllers\\CRM\\LeadsController.cs:line 515\n   at lambda_method(Closure , Object , Object[] )\n   at Microsoft.Extensions.Internal.ObjectMethodExecutor.Execute(Object target, Object[] parameters)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.SyncObjectResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.InvokeActionMethodAsync()\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.InvokeNextActionFilterAsync()\n--- End of stack trace from previous location where exception was thrown ---\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.InvokeInnerFilterAsync()\n--- End of stack trace from previous location where exception was thrown ---\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeNextResourceFilter>g__Awaited|24_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Rethrow(ResourceExecutedContextSealed context)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.InvokeFilterPipelineAsync()\n--- End of stack trace from previous location where exception was thrown ---\n   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)\n   at Microsoft.AspNetCore.Routing.EndpointMiddleware.<Invoke>g__AwaitRequestTask|6_0(Endpoint endpoint, Task requestTask, ILogger logger)\n   at Microsoft.AspNetCore.Session.SessionMiddleware.Invoke(HttpContext context)\n   at Microsoft.AspNetCore.Session.SessionMiddleware.Invoke(HttpContext context)\n   at Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)\n   at Microsoft.AspNetCore.ResponseCompression.ResponseCompressionMiddleware.Invoke(HttpContext context)\n   at Swashbuckle.AspNetCore.SwaggerUI.SwaggerUIMiddleware.Invoke(HttpContext httpContext)\n   at Swashbuckle.AspNetCore.Swagger.SwaggerMiddleware.Invoke(HttpContext httpContext, ISwaggerProvider swaggerProvider)\n   at Microsoft.AspNetCore.Diagnostics..Invoke(HttpContext context)",
          "host" => {
                "name" => "Bhavin",
            "hostname" => "Bhavin",
        "architecture" => "x86_64",
                 "mac" => [
            [0] "04:92:26:4a:aa:7b",
            [1] "52:3e:aa:6a:59:4c",
            [2] "50:3e:aa:6a:59:4c",
            [3] "00:ff:73:c0:8e:ca",
            [4] "50:3e:aa:6a:59:4c"
        ],
                  "id" => "2540a614-ed14-468c-82b6-3c2f1de5b687",
                  "ip" => [
            [ 0] "fe80::8511:1aea:78ea:8722",
            [ 1] "169.254.135.34",
            [ 2] "fe80::a1dc:1a7d:8369:4ee5",
            [ 3] "169.254.78.229",
            [ 4] "fe80::c9b5:254f:10b3:191d",
            [ 5] "169.254.25.29",
            [ 6] "fe80::ac32:76b1:509a:bc5e",
            [ 7] "169.254.188.94",
            [ 8] "2405:204:808b:8dbb:d456:182e:dd87:ca6",
            [ 9] "2405:204:808b:8dbb:3132:508b:53c1:132e",
            [10] "fe80::d456:182e:dd87:ca6",
            [11] "192.168.43.244"
        ],
                  "os" => {
                "name" => "Windows 10 Enterprise",
            "platform" => "windows",
              "kernel" => "10.0.18362.1082 (WinBuild.160101.0800)",
             "version" => "10.0",
              "family" => "windows",
               "build" => "18362.1082"
        }
    },
      "@version" => "1",
          "type" => "mytest",
    "@timestamp" => 2020-10-10T15:59:02.229Z,
          "tags" => [
        [0] "gbaseapi",
        [1] "beats_input_codec_plain_applied",
        [2] "_dateparsefailure"
    ],
        "logger" => "DeveloperExceptionPageMiddleware"
}

here is image of kibana view :

Untitled.png

you can see the other fields are created but these two are not create


Solution

  • The field level does not exist in your document when it enters your pipeline, so the conditional you are using in your grok will never match to true.

    Looking at your example messages, they have basically the same structure, they only start to be different after the ip address, you can use just one grok to parse all those messages and then have another grok that will only be applied if the overwritten message field starts with an URI.

    The following filter will parse all of your example messages.

    filter {
        grok {
            match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{INT:processId} %{LOGLEVEL:level} %{DATA:logger} %{USER:user} %{IPV4:clientIp} %{GREEDYDATA:message}" }
            overwrite => "message"
        }
    
        if [message] =~ "^http:" {
            grok {
                match => { "message" => "%{URI:requestUrl} %{USER:requestMethod} %{GREEDYDATA:message}" }
                overwrite => "message"
            }
        }
    }
    

    The first grok will parse the messages and overwrite the message field with everything the comes after the ip address. So, in the case of the message that you want to extract the requestUrl and requestMethod, you will have the message field with the following value.

    http://bhavin/Leads/New POST An unhandled exception has occurred while executing the request.System.NullReferenceException: Object reference not set to an instance of an object.
    

    The if [message] =~ "^http:" part will check if the message field starts with http:, and if true it will parse this message, which will then create the fields requestUrl and requestMethod and a new overwritten message field.

    Also, your gsub filters makes no sense, you can remove it.

    You can apply the same process to your finance log type, the only difference seems to be that the processId appears between square brackets and you are using another field for the timestamp.