I created a C++ application that reads in XML files with the RapidXML parser. At one XML file that was shaped exactly the same as another one that worked, the parser threw an error:
"expected <"
The last five characters before the error were from the closing tag of the root element, so the error happened at the end-of-file:
</UW>
What I suspect this error to be related to, is a whitespace skipping bug being an issue with RapidXML v1.12 (I am using v1.13). I used no parsing flags (doc.parse<0>(bfr);).
According to this site, the bug was believed to be caused by faulty implementation of the "parse_trim_whitespace" parse flag. A patch was provided on that site, but there also seemed to be a problem with that patch.
The following is the XML document that caused this error. What I also don't understand - besides the reason for the error - is why the error didn't happen parsing another file with content of the same fashion. My application also successfully parses several other files before that file.
<?xml version="1.0" encoding="UTF-8"?>
<UW>
<Bez>EV005</Bez>
<Herst>Trumpf</Herst>
<Gesw>16</Gesw>
<Rad>1.6</Rad>
<Hoehe>100</Hoehe>
<Wkl>30</Wkl>
<BgVerf>Freibiegen</BgVerf>
<MaxBel>50</MaxBel>
<Kontur>0</Kontur>
<Grafik>0</Grafik>
</UW>
Part of my application were the error occours (this is the inside of a loop):
// Get "Bezeichnung" attribute
attr = subnode->first_attribute("Bezeichnung");
if ( !attr ){ err(ERR_FILE_INVALID,"Werkzeuge.xml"); return 0; }
name = attr->value();
// Get file name/URL
string fileName = name;
fileName.append(".xml");
// Open file
ifstream werkzeugFile(concatURL(PFAD_WERKZEUGE,fileName));
if(!werkzeugFile.is_open()) { err(ERR_FILE_NOTFOUND,fileName); return 0; }
// Get length
werkzeugFile.seekg(0,werkzeugFile.end);
int len = werkzeugFile.tellg();
werkzeugFile.seekg(0,werkzeugFile.beg);
// Allocate buffer
char * bfr = new char [len+1];
werkzeugFile.read(bfr,len);
werkzeugFile.close();
// Parse
SetWindowText(hwndProgress,"Parsing data: Werkzeuge/*.xml");
btmDoc.parse<0>(bfr);
// Get type of tool & check validity
xml_node<> *rt_node = btmDoc.first_node();
if ( strcmp(rt_node->name(),"OW") == 0 ){
isOW = true;
}
else if ( strcmp(rt_node->name(),"UW") == 0 ){
isUW = true;
}
else { err(ERR_FILE_INVALID,fileName); return 0; }
// Prepare for next loop iteration
delete[] bfr;
btmDoc.clear();
subnode = subnode->next_sibling();
Ah, I think I see it. Two things:
First, the ifstream
is suspicious -- shouldn't it be opened in binary mode if you're jumping around in it using byte offsets (and somebody else is doing the parsing)? Passstd::ios::in | std::ios::binary
as the second argument to the ifstream
constructor.
Second, your memory management seems fine, except that you allocate one byte extra (the +1
) but never seem to make use of it. I'm assuming you're missing bfr[len] = '\0';
after the contents are read in -- this explains the odd parse error at the end of the file, since the XML parser doesn't know it reached the end of the file -- it's parsing a null terminated string that isn't null terminated, and tries to parse random bytes of memory ;-)