Search code examples
sqlregexoracle-databaseregexp-replaceregexp-substr

Fetching value from Pipe-delimited String using Regex (Oracle)


I have a sample source string like below, which was in pipe delimited format in that the value obr can be at anywhere. I need to get the second value of the pipe from the first occurrence of obr. So for the below source strings the expected would be,

Source string:

select 'asd|dfg|obr|1|value1|end' text from dual
union all
select 'a|brx|123|obr|2|value2|end' from dual
union all
select 'hfv|obr|3|value3|345|pre|end' from dual

Expected output:

value1
value2
value3

I have tried the below regexp in oracle sql, but it is not working fine properly.

with t as (
            select 'asd|dfg|obr|1|value1|end' text from dual
            union all
            select 'a|brx|123|obr|2|value2|end' from dual
            union all
            select 'hfv|obr|3|value3|345|pre|end' from dual
            )
            select text,to_char(regexp_replace(text,'*obr\|([^|]*\|)([^|]*).*$', '\2')) output from t;

It is working fine when the string starts with OBR, but when OBR is in the middle like the above samples it is not working fine.

Any help would be appreciated.


Solution

  • This handles null elements and is wrapped in a NVL() call which supplies a value if 'obr' is not found or occurs too far toward the end of a record so a value 2 away is not possible:

    SQL> with t(id, text) as (
         select 1, 'asd|dfg|obr|1|value1|end'      from dual
         union
         select 2, 'a|brx|123|obr|2|value2|end'    from dual
         union
         select 3, 'hfv|obr|3|value3|345|pre|end'  from dual
         union
         select 4, 'hfv|obr||value4|345|pre|end'   from dual
         union
         select 5, 'a|brx|123|obriem|2|value5|end' from dual
         union
         select 6, 'a|brx|123|obriem|2|value6|obr' from dual
       )
       select
         id,
         nvl(regexp_substr(text, '\|obr\|[^|]*\|([^|]*)(\||$)', 1, 1, null, 1), 'value not found') value
       from t;
    
            ID VALUE
    ---------- -----------------------------
             1 value1
             2 value2
             3 value3
             4 value4
             5 value not found
             6 value not found
    
    6 rows selected.
    
    SQL>
    

    The regex basically can be read as "look for a pattern of a pipe, followed by 'obr', followed by a pipe, followed by zero or more characters that are not a pipe, followed by a pipe, followed by zero or more characters that are not a pipe (remembered in a captured group), followed by a pipe or the end of the line". The regexp_substr() call then returns the 1st captured group which is the set of characters between the pipes 2 fields from the 'obr'.