I'm trying to load a simple XML file (encoded in UTF-8):
<?xml version="1.0" encoding="UTF-8"?>
<Test/>
And save it with MSXML in VBScript:
Set xmlDoc = CreateObject("MSXML2.DOMDocument.6.0")
xmlDoc.async = False ' load synchronously so Save runs only after Load has finished
xmlDoc.Load "C:\test.xml"
xmlDoc.Save "C:\test.xml"
The problem is that MSXML saves the file in ANSI instead of UTF-8, even though the original file is encoded in UTF-8.
The MSDN docs for MSXML say that save() writes the file in whatever encoding the XML declaration specifies:
Character encoding is based on the encoding attribute in the XML declaration, such as . When no encoding attribute is specified, the default setting is UTF-8.
But this is clearly not working, at least on my machine.
How can MSXML save in UTF-8?
Your XML file contains only ASCII characters, so its bytes are identical whether it is encoded as UTF-8 or ANSI. An editor cannot tell the two apart for such a file and may simply report ANSI. In my tests, after adding non-ASCII text to test.xml, MSXML always saved in UTF-8 and also wrote the BOM if the original file had one.
http://en.wikipedia.org/wiki/UTF-8
http://en.wikipedia.org/wiki/Byte_order_mark
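The byte-level claim can be checked outside MSXML. The following is a minimal Python sketch, not MSXML code. It assumes that "ANSI" means Windows code page 1252, and the helper names same_bytes and has_utf8_bom are made up for illustration.

```python
import codecs

# Assumption: "ANSI" here means Windows code page 1252 (Western European).
ANSI = "cp1252"


def same_bytes(text):
    """Return True if text encodes to identical bytes in UTF-8 and ANSI."""
    return text.encode("utf-8") == text.encode(ANSI)


def has_utf8_bom(data):
    """Return True if the byte string starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


# The file from the question: ASCII characters only.
ascii_xml = '<?xml version="1.0" encoding="UTF-8"?>\n<Test/>'

# The same file with one non-ASCII character added.
non_ascii_xml = '<?xml version="1.0" encoding="UTF-8"?>\n<Test>\u00e9</Test>'

print(same_bytes(ascii_xml))      # ASCII-only text: the encodings agree
print(same_bytes(non_ascii_xml))  # non-ASCII text: the encodings differ
print(has_utf8_bom(codecs.BOM_UTF8 + ascii_xml.encode("utf-8")))
```

To inspect a real file, read it in binary mode and pass the bytes to has_utf8_bom. If there is no BOM and the content is pure ASCII, nothing in the file distinguishes UTF-8 from ANSI.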