I have a field in a database table in the format:
111_2222_33333,222_444_3,aaa_bbb_ccc
This is format is uniform to the entire field. Three underscore separated numeric values, a comma, three more underscore separated numeric values, another comma and then three underscore separated text values. No spaces in between
I want to extract the middle value from the second numeric sequence, in the example above I want to get 444
In a SQL query I inherited, the regex used is ^.,(\d+)_.$
but this doesn't seem to do anything.
I've tried to identify the first comma, first number after and the following underscore ,222_
to use as a starting point and from there get the next number without the _
after it
This (,\d*_)(\d+[^_])
selects ,222_444
and is the closest I've gotten
We can try using REGEXP_REPLACE
with a capture group:
SELECT
REGEXP_REPLACE(
'111_2222_33333,222_444_3,aaa_bbb_ccc',
'^[^,]+,[^_]+_(.*?)_[^_]+,.*$',
'\1') AS num
FROM yourTable;
Here is a demo showing that the above regex' first capture group contains the quantity you want.