Search code examples
oracle11gregexp-replaceregexp-substr

oracle split by comma a quoted string


I have found tons of questions like this, but none of them are 100% suitable for me. i have oracle 11g express

I have that string

'abcd,123,,defoifcd,"comma, in string",87765,,,hello'

it means normally a comma separates the data, but could be empty data (even more in a series), and if there is comma in the data it is quoted.

so far the best reg exp is that one

'("[^"]*"|[^,]+)'

but this puts at the end of all of the empty data using that query

with test as 
(select 
'abcd,123,,defoifcd,"comma, in string", 87765,,,hello'
str from dual  
)  
select REGEXP_SUBSTR(str, '("[^"]*"|[^,]+)', 1, rownum) split
from test  
connect by level <= length (regexp_replace (str, '("[^"]*"|[^,]+)'))  + 1;

i also tried replacing the empty data with ,n/a, so to have

'abcd,123,n/a,defoifcd,"comma, in string",87765,n/a,n/a,hello'

but regexp_replace replaces only the first occurence of the empty data

select 
regexp_replace('abcd,123,,defoifcd,"comma, in string",87765,,,hello', '(,,)', ',n/a,')
str from dual;  

thanks in advance!


Solution

  • This seems to work and handles NULLs:

    SQL> with test as
        (select
        'abcd,123,,defoifcd,"comma, in string", 87765,,,hello'
        str from dual
        )
        select trim('"' from REGEXP_SUBSTR(str, '(".*?"|.*?)(,|$)', 1, level, NULL, 1)) split
        from test
        connect by level<=length(regexp_replace(str,'".*?"|[^,]*'))+1;
    
    SPLIT
    ----------------------------------------------------
    abcd
    123
    
    defoifcd
    comma, in string
     87765
    
    
    hello
    
    9 rows selected.
    
    SQL>
    

    This post provided the impetus for the solution: https://community.oracle.com/thread/528107?tstart=0