I am working on a UDF to process XML files on a Hadoop cluster. I use Pig to load the XML files and then my UDF to flatten the structure of the XML data.
My current implementation uses a DOM parser, and I didn't have to include the DOM parser jars along with my UDF jar. I am planning to move this implementation from the DOM parser to a SAX parser.
Does the Hadoop/Pig framework provide the jars for SAX parsers out of the box, or do I need to include them along with my UDF jar?
My bad. I started working on the SAX parser, and the classes it needs are available out of the box, so I didn't have to bundle any extra jars with my UDF jar.
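
For anyone landing here later, below is a minimal sketch of SAX-based flattening that uses only `javax.xml.parsers` and `org.xml.sax`, both of which ship with the standard JDK. That is most likely why nothing extra has to be packaged. The class name `SaxFlattenSketch`, the `flatten` helper, and the `path=text` output format are assumptions for illustration, not my actual UDF. Pig is left out on purpose so the snippet runs on its own.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxFlattenSketch {

    // Hypothetical helper: turns nested XML into flat "element/path=text" rows.
    public static List<String> flatten(String xml) throws Exception {
        final List<String> rows = new ArrayList<>();
        final Deque<String> path = new ArrayDeque<>();   // currently open elements
        final StringBuilder text = new StringBuilder();  // text seen since the last tag

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                path.addLast(qName);
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver one text node in several chunks, so accumulate.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    rows.add(String.join("/", path) + "=" + value);
                }
                path.removeLast();
                text.setLength(0);
            }
        };

        // The factory and parser come from the JDK itself (JAXP).
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return rows;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><id>42</id><item><name>pen</name></item></order>";
        for (String row : flatten(xml)) {
            System.out.println(row);
        }
    }
}
```

In a real Pig UDF, something like `flatten` would presumably be called from the `exec` method of a class extending `org.apache.pig.EvalFunc`, with the rows converted to tuples. The sketch ignores attributes and mixed content, which a real flattening UDF may need to handle.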