I want to get the second occurrence of the matching pattern (the text inside the brackets) using a regex. Here is the text:
[2019-07-29 09:48:11,928] @hr.com [2] [AM] WARN
I want to extract 2 from this text. I tried using
(?<Ten ID>((^)*((?<=\[).+?(?=\]))))
But it matches 2019-07-29 09:48:11,928, 2 and AM. How can I get only 2?
To get a substring between [ and ] (square brackets), excluding the brackets, you may use the /\[([^\]\[]*)\]/ regex:
\[ - a [ char
([^\]\[]*) - Capturing group 1: any 0+ chars other than [ and ]
\] - a ] char.
To get the second match, you may use
str = '[2019-07-29 09:48:11,928] @hr.com [2] [AM] WARN'
p str[/\[[^\]\[]*\].*?\[([^\]\[]*)\]/m, 1]
See this Ruby demo. Here,
\[[^\]\[]*\] - finds the first [...] substring
.*? - matches any 0+ chars, as few as possible
\[([^\]\[]*)\] - finds the second [...] substring and captures the inner contents, returned with the help of the second argument, 1.
To get the Nth match, you may also consider using
str = '[2019-07-29 09:48:11,928] @hr.com [2] [AM] WARN'
result = ''
cnt = 0
str.scan(/\[([^\]\[]*)\]/) { |match| result = match[0]; cnt +=1; break if cnt >= 2}
puts result #=> 2
See the Ruby demo
Note that if there are fewer matches than you expect, this solution will return the last matched substring.
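If falling back to the last match is not what you want, the same scan can be wrapped so that a missing Nth match yields nil instead. This is only a minimal sketch: nth_bracketed is a hypothetical helper name, not something Ruby provides.

```ruby
# Hypothetical helper (the name is an assumption, not a built-in):
# returns the contents of the n-th (1-based) [...] substring,
# or nil when the string has fewer than n such substrings.
def nth_bracketed(str, n)
  return nil if n < 1
  # With one capturing group, scan returns an array of one-element arrays.
  matches = str.scan(/\[([^\]\[]*)\]/)
  match = matches[n - 1]
  match && match[0]
end

str = '[2019-07-29 09:48:11,928] @hr.com [2] [AM] WARN'
p nth_bracketed(str, 2) #=> "2"
p nth_bracketed(str, 5) #=> nil
```

Unlike the block form above, this collects all matches before picking one, which is fine for short log lines but does more work than necessary on very long strings.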
Another solution, which is not generic and only suits this concrete case, is to extract the first occurrence of an integer inside square brackets:
s = "[2019-07-29 09:48:11,928] @hr.com [2] [AM] WARN"
puts s[/\[(\d+)\]/, 1] # => 2
See the Ruby demo.
To use the regex in Fluentd, use
\[(?<val>\d+)\]
and the value you need is in the val named group. \[ matches [, (?<val>\d+) is a named capturing group matching 1+ digits, and \] matches a ].
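The named-group pattern can be checked in plain Ruby before putting it into a Fluentd config. A quick sketch, assuming the same sample line as above:

```ruby
str = '[2019-07-29 09:48:11,928] @hr.com [2] [AM] WARN'

# match returns a MatchData object (or nil when nothing matches);
# the named group is then available via m[:val].
m = str.match(/\[(?<val>\d+)\]/)
val = m && m[:val]
p val #=> "2"
```

The timestamp is skipped because \d+ must be followed immediately by ], and 2019 is followed by a hyphen.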
Fluentular shows:
Copy and paste to fluent.conf or td-agent.conf
<source>
  type tail
  path /var/log/foo/bar.log
  pos_file /var/log/td-agent/foo-bar.log.pos
  tag foo.bar
  format /\[(?<val>\d+)\]/
</source>
Records
Key Value
val 2