I'm using a SAX parser with a custom handler to parse some XML files. This works well so far, but I want to check more than the well-formedness of the given file: I want to validate it against an XSD schema, which also contains default values for optional attributes. There are lots of tutorials online on doing this, but I was not able to find a way that satisfies all my constraints, which are as follows:
-I don't know the schema beforehand. I have a bunch of XML and XSD files, and every XML file contains information about the XSD it should conform to
-The validator should alter the stream the handler gets and insert the default values for optional attributes from the XSD if necessary
-The current custom handler should be used
I'm fairly new to this topic, so I can't rule out that I've stumbled over the solution without being aware of it, but I'm currently completely confused about how to do this.
Here is an SSCCE that shows the problem and the related parts:
package parserTest;
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.validation.TypeInfoProvider;
import javax.xml.validation.ValidatorHandler;
import org.w3c.dom.ls.LSResourceResolver;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.ErrorHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class ParserTest
{
    public final static void main(String[] args)
    {
        //Initialize SAX parser
        final SAXParserFactory saxFactory = SAXParserFactory.newInstance();
        SAXParser saxParser = null;
        try
        {
            saxParser = saxFactory.newSAXParser();
        }
        catch(ParserConfigurationException confEx){confEx.printStackTrace();}
        catch (SAXException saxEx){saxEx.printStackTrace();}
        //Initialize Handler
        DefaultHandler saxHandler = new CustomHandler();
        ValidatorHandler vh = new ValidatorHandler()
        {
            @Override
            public void startPrefixMapping(String prefix, String uri) throws SAXException{}
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException{}
            @Override
            public void startDocument() throws SAXException{}
            @Override
            public void skippedEntity(String name) throws SAXException{}
            @Override
            public void setDocumentLocator(Locator locator){}
            @Override
            public void processingInstruction(String target, String data) throws SAXException{}
            @Override
            public void ignorableWhitespace(char[] ch, int start, int length)throws SAXException{}
            @Override
            public void endPrefixMapping(String prefix) throws SAXException{}
            @Override
            public void endElement(String uri, String localName, String qName) throws SAXException{}
            @Override
            public void endDocument() throws SAXException{}
            @Override
            public void characters(char[] ch, int start, int length) throws SAXException{}
            @Override
            public void setResourceResolver(LSResourceResolver resourceResolver){}
            @Override
            public void setErrorHandler(ErrorHandler errorHandler){}
            @Override
            public void setContentHandler(ContentHandler receiver){}
            @Override
            public TypeInfoProvider getTypeInfoProvider(){return null;}
            @Override
            public LSResourceResolver getResourceResolver(){return null;}
            @Override
            public ErrorHandler getErrorHandler(){return null;}
            @Override
            public ContentHandler getContentHandler(){return null;}
        };
        vh.setContentHandler(saxHandler);
        //Do the parsing
        File input = new File("");
        try
        {
            saxParser.parse(input, saxHandler);
            //saxParser.parse(input, vh); //<-- First attempt, gives me error message
            //saxParser.setContentHandler(vh); //<-- Second attempt, but my parser does not seem to know this method
        }
        catch (IOException ioEx){ioEx.printStackTrace();}
        catch (SAXException saxEx){saxEx.printStackTrace();}
    }
    /*
     * This class is the handler to be used only by this class.
     */
    static private final class CustomHandler extends DefaultHandler
    {
        //Handle start of element
        public final void startElement(String namespaceURI, String localName, String qName, Attributes atts){}
        //Handle end of Element
        public final void endElement(String namespaceURI, String localName, String qName){}
        //Handle start of characters
        public final void characters(char[] ch, int start, int length){}
    }
}
The basic principle is to insert a ValidatorHandler between the SAX parser and your ContentHandler (see https://xerces.apache.org/xerces2-j/javadocs/api/javax/xml/validation/ValidatorHandler.html):
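// ValidatorHandler is abstract: obtain one from a javax.xml.validation.Schema instead of calling new
ValidatorHandler vh = schema.newValidatorHandler();
vh.setContentHandler(originalContentHandler);
// setContentHandler is defined on org.xml.sax.XMLReader, not on SAXParser
XMLReader parser = saxParser.getXMLReader();
parser.setContentHandler(vh);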
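The tricky bit is that in order to create a ValidatorHandler, you need a Schema, so you need to know which schema is in use. How is it identified? If the documents use the xsi:schemaLocation (or xsi:noNamespaceSchemaLocation) attribute, a Schema created with the no-argument SchemaFactory.newSchema() validates against the location hints found in each document, so the ValidatorHandler should pick the schema up automatically. If the documents use some custom mechanism, you may have to do a "prepass" that reads (some of) the source file to discover the schema, then read the file again with the ValidatorHandler in place. In either case the parser should be namespace-aware (SAXParserFactory.setNamespaceAware(true)), which is not the factory default.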
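Because the ValidatorHandler forwards the validated event stream and may augment it, your ContentHandler will be notified of the schema's default values for optional attributes that the document omits.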
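To make the wiring concrete, here is a minimal sketch of the whole chain, assuming the documents point at their schema via xsi:noNamespaceSchemaLocation. The class name ValidatingSaxDemo, the RecordingHandler (a stand-in for your existing custom handler), the runSample helper and the item.xsd / item.xml sample files are all made up for illustration; only the javax.xml and org.xml.sax calls are real API. The sample writes an absolute file URI into the location hint to keep the demo independent of how relative hints get resolved.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.ValidatorHandler;
import org.xml.sax.Attributes;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class ValidatingSaxDemo {

    /** Hypothetical stand-in for the existing custom handler: records each attribute it receives. */
    static final class RecordingHandler extends DefaultHandler {
        final Map<String, String> seen = new LinkedHashMap<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            for (int i = 0; i < atts.getLength(); i++) {
                seen.put(localName + "@" + atts.getLocalName(i), atts.getValue(i));
            }
        }
    }

    /** Parses the document at xmlUri through a ValidatorHandler and returns what the handler saw. */
    static Map<String, String> parse(String xmlUri) throws Exception {
        // No-argument newSchema(): validate against the location hints inside each document.
        SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema();
        ValidatorHandler validator = schema.newValidatorHandler();

        // The unchanged custom handler sits behind the validator.
        RecordingHandler handler = new RecordingHandler();
        validator.setContentHandler(handler);

        // Schema validation needs namespace information, which is off by default.
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader reader = parserFactory.newSAXParser().getXMLReader();
        reader.setContentHandler(validator);
        reader.parse(xmlUri); // a validation error surfaces as a SAXException
        return handler.seen;
    }

    /** Writes a made-up schema and document to a temp directory and parses the document. */
    static Map<String, String> runSample() throws Exception {
        Path dir = Files.createTempDirectory("sax-xsd-demo");
        Path xsd = dir.resolve("item.xsd");
        Path xml = dir.resolve("item.xml");
        String schemaText =
              "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"item\"><xs:complexType>"
            + "<xs:attribute name=\"id\" type=\"xs:string\" use=\"required\"/>"
            + "<xs:attribute name=\"unit\" type=\"xs:string\" default=\"kg\"/>"
            + "</xs:complexType></xs:element></xs:schema>";
        Files.write(xsd, schemaText.getBytes(StandardCharsets.UTF_8));
        // The document omits the optional "unit" attribute.
        String xmlText =
              "<item xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
            + " xsi:noNamespaceSchemaLocation=\"" + xsd.toUri() + "\" id=\"42\"/>";
        Files.write(xml, xmlText.getBytes(StandardCharsets.UTF_8));
        return parse(xml.toUri().toString());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runSample());
    }
}
```

The map printed by main should contain an entry for item@unit with the value kg even though the document never mentions that attribute, alongside the explicit id and the xsi hint itself. If your files identify their schema some other way, you would replace the no-argument newSchema() call with newSchema(File) or newSchema(Source) once the prepass has found the right XSD.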