Search code examples
sqlxmloracle-databasesqlpluscdata

How can i get CDATA value from xml


Thanks for click,

  1. I need value in CDATA

  2. If I want to find someone whose name is Maria and her birthday is 2012-03-12 How do I find it??

SELECT t.doc.extract('/RESPONSE/INFO/STU_NAME/text()').getStringVal()
"stu"
FROM table t
WHERE  t.doc.extract('/RESPONSE/INFO/STU_NAME/text()').getStringVal()
LIKE '%M%'

I done try it, but you see below is the result

<![CDATA[ Maria ]]>

I don't need <![CDATA[]]>

// XML
<STUDENT>
<INFO>
  <STU_NAME><![CDATA[ Maria ]]></STU_NAME>
  <STU_WARN><![CDATA[ one ]]></STU_WARN>
  <BIRTHDAY><![CDATA[ 2012-03-12 ]]></BRITHDAY>
</INFO>
<INFO>
  <STU_NAME><![CDATA[ Kevin) ]]></STU_NAME>
  <STU_WARN><![CDATA[ one ]]></STU_WARN>
  <BIRTHDAY><![CDATA[ 2010-07-15 ]]></BRITHDAY>
</INFO>

Solution

  • The extract function has been deprecated for a long time (at least since 11gR2 - see the note in that documentation).

    If you have multiple values and might want to see more than one you could use XMLTable, which removes the CDATA noise (but might need a trim as you have spaces in the values):

    select x.stu_name, x.birthday,
      trim(stu_name) as stu_name2, to_char(x.birthday,'YYYY-MM-DD') as birthday2
    from your_table t
    cross join xmltable ('/RESPONSE/INFO' passing t.doc
      columns
        stu_name varchar2(30) path 'STU_NAME',
        birthday date path 'BIRTHDAY'
    ) x
    where x.stu_name like '%M%'
    
    STU_NAME BIRTHDAY  STU_NAME2 BIRTHDAY2
    -------- --------- --------- ---------
     Maria   12-MAR-12 Maria     2012-03-12
    

    If you're targeting a single value you could also use xmlquery, which is closer to your extract:

    select regexp_replace(
      xmlquery('/RESPONSE/INFO[contains(BIRTHDAY, "2012-03-12")]/STU_NAME/text()'
        passing doc
        returning content),
      '<!\[CDATA\[ *(.*?) *\]\]>', '\1') as stu_name
    from your_table t
    
    STU_NAME
    --------
    Maria
    

    Here's I've looked for the birthday you wanted as text within a node, and got the matching name; but as that still has the CDATA it is roughly the same as what you had. So, I've used a regular expression to strip the CDATA part, though you could also use substr/instr if performance is a concern.

    db<>fiddle