java parsing sax

SAX parser: Ignoring HTML


I am using the SAX parser to parse an XML file. It works fine, but I don't want to parse the content of an <info> tag, as it contains HTML that I want to save to a string. Can anyone tell me whether there is a way to do this?

Thanks


Solution

  • Tough question. The best approach might be to preprocess the stream, escaping the part between <info> and </info> yourself. You could, for example, write a wrapper around the input stream that transforms your input on the fly, so that the SAX parser only ever sees valid XML.
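
A minimal sketch of the preprocessing idea, under some loud assumptions: the whole document fits in memory as a String (a simplification of the streaming wrapper suggested above, not the wrapper itself), <info> elements are never nested or self-closing, and the embedded HTML never contains a literal </info>. The class name InfoEscaper and its methods escapeInfo and readInfo are hypothetical names made up for this illustration.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: escapes the HTML inside <info> so SAX sees it as plain text.
class InfoEscaper {
    // Matches <info ...>body</info>; DOTALL lets the body span several lines.
    private static final Pattern INFO =
        Pattern.compile("(<info(?:\\s[^>]*)?>)(.*?)(</info>)", Pattern.DOTALL);

    /** Escapes &, < and > inside every <info> element; everything else is copied unchanged. */
    static String escapeInfo(String xml) {
        Matcher m = INFO.matcher(xml);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(xml, last, m.start());
            // Escape & first so the entities added for < and > are not escaped again.
            String body = m.group(2)
                .replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
            out.append(m.group(1)).append(body).append(m.group(3));
            last = m.end();
        }
        out.append(xml, last, xml.length());
        return out.toString();
    }

    /** Parses the preprocessed document and returns the concatenated text of all <info> elements. */
    static String readInfo(String xml) throws Exception {
        StringBuilder info = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inInfo;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("info".equals(qName)) inInfo = true;
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("info".equals(qName)) inInfo = false;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser un-escapes the entities, so this collects the original HTML.
                if (inInfo) info.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(escapeInfo(xml))), handler);
        return info.toString();
    }
}
```

Calling InfoEscaper.readInfo on a document such as <root><info><b>Hi</b><br>there</info></root> should hand back the raw HTML string, because the parser decodes the escaped entities again in characters(). For large inputs, the same transformation would have to be done incrementally inside a Reader or InputStream wrapper, which this sketch does not attempt.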