I am opening a XML file using .NET XmlReader and saving the file in another filename and it seems that the DOCTYPE declaration changes between the two files. While the newly saved file is still valid XML, I was wondering why it insisted on changing original tags.
Dim oXmlSettings As Xml.XmlReaderSettings = New Xml.XmlReaderSettings()
oXmlSettings.XmlResolver = Nothing
oXmlSettings.CheckCharacters = False
oXmlSettings.ProhibitDtd = False
oXmlSettings.IgnoreWhitespace = True
Dim oXmlDoc As XmlReader = XmlReader.Create(pathToOriginalXml, oXmlSettings)
Dim oDoc As XmlDocument = New XmlDocument()
oDoc.Load(oXmlDoc)
oDoc.Save(pathToNewXml)
The following (in the original document):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd">
becomes (notice the [ ] characters at the end):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd"[]>
Probably the library parses the DOCTYPE element into an internal structure and then converts the structure back to text. It doesn't store the original string form.