Tags: java, xml, xml-parsing, sax, saxparser

SAX Using Multiple DefaultHandler Extensions


I want to parse a single XML document using several extensions of the SAX DefaultHandler class. With just one handler I can parse the XML and assign different tags to the properties of an object (see Domain and Area). Once I have the Domain and Area objects, I want to add them to another object (GroupedFiles), which holds a list of domains and a list of areas. The problem is parsing one document with two handlers. My plan, although probably not the best coding practice, is to parse the document twice: run through it once with the domain handler, set the domains, and add them to the grouped files, then do the same for the areas. Here is the code:

GroupedFiles groupedFiles = new GroupedFiles();
ArrayList<Domain> domains = new ArrayList<Domain>();
ArrayList<Area> areas = new ArrayList<Area>();

//Create parser from factory
XMLReader parser = XMLReaderFactory.createXMLReader();

//Creates an input stream from the file "someFile.xml"
InputStream in = new FileInputStream(new File("someFile.xml"));
InputSource source = new InputSource(in);

//Create handler instances
DomainHandler domainHandler = new DomainHandler();
AreaHandler areaHandler = new AreaHandler();

//Parses out XML from a document using each handler, 
//adding it to an object with the correct properties then adds those
//to another object which features Lists of Domains, Areas, and Directories.
parser.setContentHandler(domainHandler);
parser.parse(source);
domains = domainHandler.getXML();
groupedFiles.setDomain(domains);

parser.setContentHandler(areaHandler);
parser.parse(source);
areas = areaHandler.getXML();
groupedFiles.setArea(areas);

However, this does not work: it appears to hang on the second parser.parse(source). If I inspect groupedFiles after running, the domains are populated but the areas are not. Any advice?


Solution

  • You don't need to make multiple passes; you can swap content handlers while parsing. XMLReader has a setContentHandler method that you can call to pass in a new handler. For instance, you can set a new content handler in startElement when you recognize a tag that begins a region covered by a different handler, and switch back to the previous content handler in endElement when you leave that element.

    For examples see this JavaWorld article or check out this answer.
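
As a rough illustration of the handler swap described above, here is a minimal sketch. The question does not show the XML, so the element names (files, domain, area, name) and the classes RootHandler and SectionHandler are hypothetical stand-ins for the real DomainHandler and AreaHandler. It collects plain strings instead of Domain and Area objects.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class HandlerSwitchSketch {

    // Handles the inside of one <domain> or <area> element (hypothetical names),
    // then hands control back to the parent handler when that element closes.
    static class SectionHandler extends DefaultHandler {
        private final XMLReader reader;
        private final ContentHandler parent;
        private final String sectionTag;
        private final List<String> out;
        private final StringBuilder text = new StringBuilder();

        SectionHandler(XMLReader reader, ContentHandler parent,
                       String sectionTag, List<String> out) {
            this.reader = reader;
            this.parent = parent;
            this.sectionTag = sectionTag;
            this.out = out;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals(sectionTag)) {
                reader.setContentHandler(parent); // leaving the section: switch back
            } else if (qName.equals("name")) {
                out.add(text.toString().trim());
            }
        }
    }

    // Top-level handler: delegates each section to a SectionHandler.
    static class RootHandler extends DefaultHandler {
        final List<String> domains = new ArrayList<>();
        final List<String> areas = new ArrayList<>();
        private final XMLReader reader;

        RootHandler(XMLReader reader) {
            this.reader = reader;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (qName.equals("domain")) {
                reader.setContentHandler(new SectionHandler(reader, this, "domain", domains));
            } else if (qName.equals("area")) {
                reader.setContentHandler(new SectionHandler(reader, this, "area", areas));
            }
        }
    }

    // Parses the document once, swapping handlers as sections start and end.
    static RootHandler parse(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        RootHandler root = new RootHandler(reader);
        reader.setContentHandler(root);
        reader.parse(new InputSource(new StringReader(xml)));
        return root;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<files><domain><name>d1</name></domain>"
                   + "<area><name>a1</name></area>"
                   + "<domain><name>d2</name></domain></files>";
        RootHandler result = parse(xml);
        System.out.println(result.domains); // prints [d1, d2]
        System.out.println(result.areas);   // prints [a1]
    }
}
```

With this layout both lists are filled in a single pass, so the results could be passed to groupedFiles.setDomain and groupedFiles.setArea after one call to parse. Separately, the second parser.parse(source) in the question reuses an InputSource whose underlying FileInputStream was already read to the end by the first pass, so a two-pass approach would at least need a fresh stream for each pass.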