Search code examples
regexoracleoracle10g

Regex matching works on regex tester but not in oracle


I have the regex \bname[^a-zA-Z]+[0-9]+. This regex works on https://regexr.com/ but does not work in oracle. The pattern I am going for is: <exact word "name" (upper or lower case)><any character/s that is a non-alphabet (including line breaks) or it can be no character><a whole number>

Where is my testing sample:

name8213

name:1232

name: 234

name 
1231

name: 
985

name:, 123

-- Should not match any of the below text

nameis1233

name is 123

ornaments are cool 360

nickname 1323

name 1234 1233 (should not match the second set of numbers)

However, when I execute

SELECT REGEXP_SUBSTR('name 123', '\bname[^a-zA-Z]+[0-9]+', 1, 1, 'i') FROM DUAL

I get nothing out. My end goal is to just extract that whole number.


Solution

  • Oracle regex does not support word boundaries.

    In this case, you need to do three things:

    • Use the (\W|^) as the leading word boundary construct substitute and
    • Wrap the part of the regex you need to get as a return with another capturing group, and later get it using the right argument to the REGEXP_SUBSTR function
    • Replace the [^a-zA-Z]+ with \W* (or [^[:alnum:]]* if you allow underscores there), as your intention is to match any non-word chars between name and a number, and fail all matches with letters and digits in between them.

    You can use

    SELECT REGEXP_SUBSTR('name 123', '(\W|^)(name\W*[0-9]+)', 1, 1, 'i', 2) FROM DUAL
    

    The last argument, 2, tells the REGEXP_SUBSTR to fetch the value of the second capturing group.

    Details:

    • (\W|^) - Group 1: a non-word char (any char other than a letter, digit or underscore) or start of string
    • (name\W*[0-9]+) - Group 2: name, then any zero or more non-word chars (or any zero or more non-alphanumeric chars if you use [^[:alnum:]]*)and then one or more digits.