
How to match an optional group in a regular expression


I need a regular expression to use in fluentd for parsing nginx error logs.

The sample row is:

2024/04/15 09:06:29 [error] 3443790#3443790: *176070165 limiting requests, excess: 2.957 by zone "RequestLimitForCommonApi", client: 77.81.151.129, server: test.com, request: "POST /capi/session/forgot HTTP/1.1", host: "test.com", referrer: "https://test.com/"

I'm using the following format for matching log parameters:

format1 /^(?<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?<log_level>\w+)\] (?<pid>\d+).(?<tid>\d+): (?<error>.*), (?<client>.*), (?<server>.*), (?<request>.*), (?<host>.*), (?<referrer>.*)/

However, some log rows have an 'upstream' parameter and some do not.

What regular expression should I use to also capture the 'upstream' value when it is present?

Sample log row with upstream:

2024/04/15 02:01:32 [error] 3443790#3443790: *172976982 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 86.55.16.251, server: test.com, request: "POST /api/test HTTP/1.1", upstream: "http://127.0.0.1:30110/api/test", host: "test.com", referrer: "https://test.com/"

Solution

  • You can try the following pattern. As an example, it matches client and request, and treats another phrase between them, such as "test", as optional.

    client: (?<client>[^,]+),( test: )?((?<test>[^,]+),)? request: (?<request>[^,]+)

    As you can see, "test:" and the value that follows it are optional.
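
Applied to the log rows in the question, the same optional-group idea might look like the Ruby sketch below (fluentd's regexp parser uses Ruby regular expressions). This is only an illustration: the group name `upstream`, the constant and method names, and the stricter sub-patterns (`[^,]+`, quoted fields, a lazy `.*?` for the error text) are assumptions and are not part of the original format1 line. The key part is `(?:, upstream: "(?<upstream>[^"]*)")?`, a non-capturing group made optional with `?`.

```ruby
# Sketch only: the question's pattern with an optional, non-capturing
# wrapper around the "upstream" field. The group name `upstream` and the
# stricter sub-patterns are assumptions, not part of the original format1.
NGINX_ERROR = %r{^(?<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?<log_level>\w+)\] (?<pid>\d+)\#(?<tid>\d+): (?<error>.*?), client: (?<client>[^,]+), server: (?<server>[^,]+), request: "(?<request>[^"]*)"(?:, upstream: "(?<upstream>[^"]*)")?, host: "(?<host>[^"]*)", referrer: "(?<referrer>[^"]*)"}

# Returns a Hash of named captures, or nil if the line does not match.
def parse_nginx_error(line)
  m = NGINX_ERROR.match(line)
  m && m.named_captures
end

# The two sample rows from the question.
WITHOUT_UPSTREAM = '2024/04/15 09:06:29 [error] 3443790#3443790: *176070165 limiting requests, excess: 2.957 by zone "RequestLimitForCommonApi", client: 77.81.151.129, server: test.com, request: "POST /capi/session/forgot HTTP/1.1", host: "test.com", referrer: "https://test.com/"'
WITH_UPSTREAM = '2024/04/15 02:01:32 [error] 3443790#3443790: *172976982 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 86.55.16.251, server: test.com, request: "POST /api/test HTTP/1.1", upstream: "http://127.0.0.1:30110/api/test", host: "test.com", referrer: "https://test.com/"'

p parse_nginx_error(WITHOUT_UPSTREAM)["upstream"] # prints nil
p parse_nginx_error(WITH_UPSTREAM)["upstream"]    # prints "http://127.0.0.1:30110/api/test"
```

If this behaves as intended on your logs, the part between `%r{` and `}` could be placed between the slashes of the format1 line. Whether the forward slashes in the date need escaping there is worth checking against your fluentd version.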