regex lucene kibana

In Kibana, regex filters do not seem to be anchored at the field start and end, and they can't match a literal '...' sequence


I have a set of log entries in Kibana, one line per record.

I'm trying to filter out ones that look like this in the expanded log entry:

t dlog.line 53...

and like this in the log:

{"dlog":{ ..., "line":"53...\n", ...}, ...}

So it's a text record - nothing fancy - and I have a filter like this to try to match it, using Kibana's Lucene regexes, NOT PCRE:

{ "query": { "regexp": { "dlog.line": { "value": "[0-9][0-9]\\.\\.\\.\n?" } } } }

If I set it, it excludes everything. If I negate it, it includes everything. I have to double-escape the dots, or Kibana's EDIT FILTER won't let me save it.

If I fiddle with the regex, I can get it to match other things with dots: [0-9][0-9]\\..* matches numeric IP addresses and the hours.minutes parts of timestamps, but not my target lines, even though they start with two digits and a literal dot.

The assertion that it matches the whole field end to end seems to be untrue. Confusingly, the entire IP address is highlighted in the Discover table, as if Kibana runs the regex once to find the matching records and then again, with different parameters, to highlight the matches. But the highlighting seems to use different logic: why does the highlight stop at colons in timestamps, or at the square bracket in the log line, when .* should run to the end of the line?

t dlog.line localhost [127.0.0.1] 5538 (?) : Connection refused
                       ^^^^^^^^^ highlit
t dlog.line == STATUS: 2024-01-12.10:31:58 watchdog-awaiting-device-60s 5538
                               ^^^^^ highlit

Using [0-9][0-9].* does match my 58... lines, but also a pile of others. The highlights don't light up the ... sequence in my target lines, but they do mark a whole pile of other substrings in other entries (the .* matches letters, sometimes the colon, and the . in numeric IP addresses, but none of -],/ that I've seen).

t dlog.line host doesn't support requested feature: CPUID.80000001H:ECX.abm [bit 5]
                                                          ^^^^^^^^^^^^^^^^^

I've tried [0-9][0-9]\\\\.*\\\\n with any number of \ escapes that Kibana will let me put into its EDIT FILTER dialogue, and nothing seems to work.

I've looked through the extra flags I can set, but nothing looks relevant.

Help? I'm very familiar with PCRE regexes, but this Lucene/Kibana nonstandard regex stuff has eaten a lot of my time.


Solution

  • The field in question has a .keyword version.

    Counterintuitively, pointing the regex at that field (dlog.line.keyword) instead of dlog.line made it work.
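
A likely explanation, offered as a sketch rather than a confirmed diagnosis: Elasticsearch's regexp query is a term-level query, so on an analyzed text field the pattern is tested against each indexed token, not against the whole line. If dlog.line uses something like the standard analyzer (an assumption; the mapping isn't shown), "53...\n" is indexed as just the token 53, which can never match a pattern that requires dots, whereas the .keyword sub-field stores the whole value as a single term. The Python below only imitates this behaviour: rough_tokens is a hypothetical and very crude stand-in for an analyzer, Python's re is not Lucene's regex engine, and fullmatch is used to mimic Lucene's implicit anchoring.

```python
import json
import re

# Assumption: "dlog.line.keyword" is the unanalyzed sub-field of dlog.line.
KEYWORD_FIELD = "dlog.line.keyword"

# The pattern from the question, unchanged: two digits, three literal dots,
# then an optional newline. Lucene anchors it to the whole term implicitly.
LUCENE_PATTERN = "[0-9][0-9]\\.\\.\\.\n?"


def build_filter(field, pattern):
    """Build the regexp query body for the given field and pattern."""
    return {"query": {"regexp": {field: {"value": pattern}}}}


def rough_tokens(text):
    """Hypothetical, crude imitation of an analyzer: lowercased runs of
    letters/digits, kept together across single dots (e.g. "12.10")."""
    return [t.lower() for t in re.findall(r"[0-9A-Za-z]+(?:\.[0-9A-Za-z]+)*", text)]


def matches_as_keyword(py_pattern, value):
    """Keyword field: the whole value is one term, matched end to end."""
    return re.fullmatch(py_pattern, value) is not None


def matches_as_text(py_pattern, value):
    """Text field: each token is a term; any fully matching token is a hit."""
    return any(re.fullmatch(py_pattern, tok) for tok in rough_tokens(value))


TARGET = "53...\n"
STATUS = "== STATUS: 2024-01-12.10:31:58 watchdog-awaiting-device-60s 5538"
STRICT = r"[0-9][0-9]\.\.\.\n?"  # Python spelling of LUCENE_PATTERN
LOOSE = r"[0-9][0-9]\..*"        # Python spelling of [0-9][0-9]\\..*

if __name__ == "__main__":
    # The original filter with only the field name changed.
    print(json.dumps(build_filter(KEYWORD_FIELD, LUCENE_PATTERN)))
    # The dots and newline are gone from the token, so STRICT cannot match it.
    print(rough_tokens(TARGET))
    print(matches_as_text(STRICT, TARGET), matches_as_keyword(STRICT, TARGET))
    # LOOSE hits the "12.10" token in the timestamp but not the whole line.
    print(matches_as_text(LOOSE, STATUS), matches_as_keyword(LOOSE, STATUS))
```

Under these assumptions the strict pattern misses the target line when matched token by token but matches it as a whole value, and the loose pattern hits the 12.10 token inside the timestamp, which is the substring highlighted in the question. One caveat, if the index uses default dynamic mappings: keyword sub-fields typically ignore values longer than 256 characters, so very long log lines may not be matched by the .keyword filter.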