Tags: elasticsearch, internationalization, logstash, grok, logstash-grok

Split Logstash/grok pattern that has international characters


I'm running into the following issue.

I need to split up URLs to get values from them. This works great when it's all English.

URL = /78965asdvc34/Test/testBasins

Pattern = /%{WORD:org}/(?i)test/%{WORD:name}

I get this in the grok debugger: {"org":[["78965asdvc34"]],"name":[["testBasins"]]}

If the URL contains international characters, grok does not match them with the pattern above.

/78965asdvc34/Test/浸水Basins

Any thoughts on how to get this to work? The value can be in any language in the logs, and hopefully there is a way to extract it.


Solution

  • Have you already tried this pattern?

    /%{WORD:org}/(?i)test/%{GREEDYDATA:name}

    Answer from hurb.

    Thanks, hurb. GREEDYDATA worked.
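
For anyone wondering why the swap helps: in the stock grok pattern set, WORD is defined as \b\w+\b and GREEDYDATA as .*, so WORD only accepts what the regex engine counts as a word character, while GREEDYDATA takes anything up to the end of the line. Below is a minimal Python sketch of that difference, not the actual grok implementation. It assumes an engine where \w covers only [A-Za-z0-9_] (emulated here with re.ASCII), which would explain the behavior in the question. The helper names build and extract are hypothetical, and (?i:test) stands in for the inline (?i)test from the original pattern.

```python
import re

# Stand-ins for the stock grok definitions of WORD and GREEDYDATA.
WORD = r"\b\w+\b"
GREEDYDATA = r".*"


def build(name_pattern):
    """Compile a regex shaped like /%{WORD:org}/(?i)test/%{...:name}."""
    # re.ASCII is an assumption: it restricts \w and \b to ASCII word
    # characters, mimicking an engine that is not Unicode-aware for \w.
    return re.compile(
        rf"/(?P<org>{WORD})/(?i:test)/(?P<name>{name_pattern})",
        re.ASCII,
    )


def extract(name_pattern, url):
    """Return the captured fields as a dict, or None if nothing matches."""
    match = build(name_pattern).search(url)
    return match.groupdict() if match else None


english = "/78965asdvc34/Test/testBasins"
intl = "/78965asdvc34/Test/浸水Basins"

print(extract(WORD, english))     # both fields captured
print(extract(WORD, intl))        # no match: 浸 is not an ASCII word character
print(extract(GREEDYDATA, intl))  # name captured, including 浸水
```

One thing to keep in mind: because GREEDYDATA matches everything to the end of the input, any trailing path segments or query string would also end up in name.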