I'm trying to read an RSS feed/XML file into my application. The problem is that the data starts with a BOM (Byte Order Mark) that my input stream doesn't like: the parser throws an error, which triggers another error, and everything dies.
Here's the method:
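    import java.io.StringReader;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.xml.sax.InputSource;

    private Document getDomFromXMLString(String xml) {
        Document doc = null;
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder db = dbf.newDocumentBuilder();
            // Feed the parser characters (not bytes) taken from the String
            InputSource is = new InputSource();
            is.setCharacterStream(new StringReader(xml));
            doc = db.parse(is);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return doc;
    }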
So I'm trying to figure out how to skip the BOM and parse the rest of the file.
If you have a character stream, which is what a String is, then skipping the BOM is as easy as stripping the first character when that character is the BOM (U+FEFF):
    // startsWith is safe on an empty string, unlike charAt(0)
    if (xml.startsWith("\uFEFF")) {
        xml = xml.substring(1);
    }
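For completeness, here is a minimal sketch of how that check could be folded into the parsing method from the question. The class name `BomSafeXml` and the helpers `stripBom` and `parse` are made up for illustration, and unlike the original method this version propagates exceptions instead of printing them and returning null.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative, not from the question.
class BomSafeXml {
    // The BOM as it appears once the bytes have been decoded to characters.
    private static final String BOM = "\uFEFF";

    // Returns the input without a leading BOM; null and empty input pass through.
    static String stripBom(String xml) {
        return (xml != null && xml.startsWith(BOM)) ? xml.substring(1) : xml;
    }

    // Parses XML held in a String, tolerating a leading BOM.
    static Document parse(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(stripBom(xml))));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("\uFEFF<rss><channel/></rss>");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Only one leading U+FEFF is removed here; if a feed is ever seen with more than one, the check would have to loop.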
What you should really do, though, is ask the source to fix its feed; the BOM shouldn't be there in the first place.