Search code examples
sqloracle-databaseoracle12cxmltable

Escaping characters for use with Oracle's Xmltable


I'm using Xmltable to convert a field of comma-delimited email addresses to a table of values.

WITH

data AS
  (
    select  1 ID, 'foo&bar@domain.tld,bar@domain.tld' recipients from dual
  )

select  ID, trim(COLUMN_VALUE) recipient
from    data,xmltable(('"'|| REPLACE( recipients , ',', '","') || '"'))

produces an error:

[72000][19112] ORA-19112: error raised during evaluation: XVM-01003: [XPST0003] Syntax error at '"foo' 1 "foo&bar@domain.tld","bar@domain.tld" - ^

However, when I replace the & with its entity value (&):

WITH

DATA AS
  (
    select  1 ID, 'foo&bar@domain.tld,bar@domain.tld' recipients from dual
  )

select  ID
        -- & --> &
        , replace( trim(COLUMN_VALUE), '&', '&') recipient
from    data
        -- & --> &
        ,xmltable(('"'|| REPLACE( replace( recipients, '&','&')  , ',', '","') || '"'))

the query works:

ID,RECIPIENT
1,foo&bar@domain.tld
1,bar@domain.tld

I'm imaging that there might be other characters that are valid in an email address, but will be problematic for Xmltable.

Is there a better way to do this?


Solution

  • You could use the built-in dbms_xmlgen.convert() function:

    with data (id, recipients) as
    (
      select 1, 'foo&bar@domain.tld,bar@domain.tld' from dual
    )
    select d.id, dbms_xmlgen.convert(x.column_value.getstringval(), 1) as recipient
    from data d
    cross join
    xmltable(('"' || replace(dbms_xmlgen.convert(d.recipients, 0), ',', '","') || '"')) x
    
            ID RECIPIENT                     
    ---------- ------------------------------
             1 foo&bar@domain.tld            
             1 bar@domain.tld                
    

    The inner call dbms_xmlgen.convert(d.recipients, 0) gives you

    foo&bar@domain.tld,bar@domain.tld
    

    After that has been modified to have double quotes around each comma-separated value and been split into multiple rows, you end up with column_value as:

    foo&bar@domain.tld
    bar@domain.tld
    

    so the outer dbms_xmlgen.convert(x.column_value.getstringval(), 1) converts any encoded entities back to their plain versions.

    If you do this in a PL/SQL context then you can use dbms_xmlgen.entity_encode and dbms_xmlgen.entity_decode instead of the fixed 0 and 1, but those aren't available in plain SQL.

    (There are only five entities to worry about anyway, but you still might as well allow for them all - whether they are valid in email addresses or not - and using a function call like this is maybe going to be less confusing to future maintainers...)