Search code examples
sqlregexstringoracle-databasesubstr

Regexp or substr or another method to find a string


I want to reach the best performance and select a "string" only after the word "DL:"

I have a column (varchar2) with values:

    DL:1011909825
    Obj:020190004387 DL:8010406429
    Obj:020190004388 DL:8010406428
    DL:190682
    DL:PDL01900940
    Obj:020190004322 DL:611913067

so output be like:

    1011909825
    8010406429
    8010406428
    190682
    PDL01900940
    611913067

I'm not expert in regular expressions, but i tried regexp_replace:

regexp_replace(column,'Obj:|DL:','',1, 0, 'i')

It's almost OK, but the output is still not the same:

    1011909825
    020190004387 8010406429
    020190004388 8010406428
    190682
    PDL01900940
    020190004322 611913067

How can i solve this problem & reach the best performance ?


Solution

  • If data always looks like this, then SUBSTR + INSTR do the job:

    SQL> with test (col) as
      2    (
      3      select 'DL:1011909825' from dual union all
      4      select 'Obj:020190004387 DL:8010406429' from dual union all
      5      select 'Obj:020190004388 DL:8010406428' from dual union all
      6      select 'DL:190682' from dual union all
      7      select 'DL:PDL01900940' from dual union all
      8      select 'Obj:020190004322 DL:611913067' from dual
      9     )
     10  select col, substr(col, instr(col, 'DL:') + 3) result
     11  from test;
    
    COL                            RESULT
    ------------------------------ ------------------------------
    DL:1011909825                  1011909825
    Obj:020190004387 DL:8010406429 8010406429
    Obj:020190004388 DL:8010406428 8010406428
    DL:190682                      190682
    DL:PDL01900940                 PDL01900940
    Obj:020190004322 DL:611913067  611913067
    
    6 rows selected.
    
    SQL>
    

    REGEXP_SUBSTR might look like this:

     <snip>
     10  select col,
     11         ltrim(regexp_substr(col, 'DL:\w+'), 'DL:') resul
     12  from test;
    
    COL                            RESULT
    ------------------------------ -----------------------------
    DL:1011909825                  1011909825
    Obj:020190004387 DL:8010406429 8010406429
    Obj:020190004388 DL:8010406428 8010406428
    DL:190682                      190682
    DL:PDL01900940                 PDL01900940
    Obj:020190004322 DL:611913067  611913067
    

    If there's a lot of data, this should be way faster than regular expressions.