Search code examples
snowflake-cloud-data-platformparsexml

How to ignore corrupted xml row with parse_xml in snowflake


Function parse_xml fails with message Error parsing XML: not an XML element, pos 1, this is due to one invalid xml entry, is there anyway to ignore the invalid one and proceed.

Here is a background I get's a xml data into snowflake table through a rest service which stores as an varchar, further picks this data, parses into xml and does filtering/processing. Function parse_xml working well in our case but it gave's up if any one entry invalid.

Here is an example query below, I am looking to ignore "red" and return "green" instead of throwing error.

SELECT parse_xml(XML)
FROM
  (SELECT ('<color>green</color>') XML
   UNION SELECT ('red') XML);

Solution

  • Check the XML for validity with check_xml():

    SELECT parse_xml(XML)
    FROM
      (SELECT ('<color>green</color>') XML
       UNION SELECT ('red') XML)
    WHERE check_xml(XML) is NULL ;
    

    This returns <color>green</color>, as desired.