Search code examples
regexoracle-databaseplsql

Oracle Regex - Splitting a string by more than 1 delimiter


I'm looking to currently split a string in Oracle by more than 1 character.

Currently I'm splitting a set of numbers by a space like so:

SELECT regexp_substr('123 456', '(^| )([^ ]*)') from dual
SELECT regexp_substr('123 456', '(^| )([^ ]*)', 1, 2, null, 2) from dual

etc. which returns back 123 and then 456.

However I'm looking to have a scenario where I could input "123 456-789" and be able to split the numbers into "123" "456" "789".

I've tried

(^| )([^ ]*)|^[^-]*[^ -]

as my new regex expression but no joy.


Solution

  • You can use:

    SELECT value,
           REGEXP_SUBSTR(value, '(^|[ -])([^ -]*)', 1, 1, null, 2) AS first,
           REGEXP_SUBSTR(value, '(^|[ -])([^ -]*)', 1, 2, null, 2) AS second,
           REGEXP_SUBSTR(value, '(^|[ -])([^ -]*)', 1, 3, null, 2) AS third
    FROM   table_name
    

    Or, more simply, don't look for delimiters, just look for numbers:

    SELECT value,
           REGEXP_SUBSTR(value, '\d+', 1, 1) AS first,
           REGEXP_SUBSTR(value, '\d+', 1, 2) AS second,
           REGEXP_SUBSTR(value, '\d+', 1, 3) AS third
    FROM   table_name
    

    Which, for the sample data:

    CREATE TABLE table_name (value) AS
    SELECT '123 456'     FROM DUAL UNION ALL
    SELECT '123-456 789' FROM DUAL;
    

    Both output:

    VALUE FIRST SECOND THIRD
    123 456 123 456 null
    123-456 789 123 456 789

    However, if you only ever have single character delimiters, you can use simple string functions (which is more to type but may be more efficient):

    SELECT value,
           CASE
           WHEN sep1 > 0 THEN SUBSTR(value, 1, sep1 - 1)
           ELSE value
           END AS first,
           CASE
           WHEN sep2 > 0 THEN SUBSTR(value, sep1 + 1, sep2 - sep1 - 1)
           WHEN sep1 > 0 THEN SUBSTR(value, sep1 + 1)
           END AS second,
           CASE
           WHEN sep2 > 0 THEN SUBSTR(value, sep2 + 1)
           END AS third
    FROM   (
      SELECT value,
             normalised_value,
             INSTR(normalised_value, ' ', 1, 1) AS sep1,
             INSTR(normalised_value, ' ', 1, 2) AS sep2
      FROM   (
        SELECT value,
               TRANSLATE(value, ' -', '  ') AS normalised_value
        FROM   table_name
      )
    )
    

    Which has the same output.

    fiddle