Search code examples
xmlvbadombyte-order-mark

XML file output only shows Byte Order Mark


I have an XML file that I am trying to parse, whose contents are exactly the XML below:

<Results xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Reference>{REFERENCE-HERE}</Reference>
  <FillerTags>Filler</FillerTags>
  <entity>
    <entityName>ABC</entityName>
    <entityId>012345</entityId>
  </entity>
  <Items>
     <Item>
      <FillerTagsAgain>Filler2</FillerTagsAgain>
      <FillerTagsAgain>Filler2</FillerTagsAgain>
      <FillerTagsAgain>Filler2</FillerTagsAgain> 
     </Item>
     <AnotherItem> 
       <FillerTagsAgain>Filler2</FillerTagsAgain>
       <FillerTagsAgain>Filler2</FillerTagsAgain>
       <FillerTagsAgain>Filler2</FillerTagsAgain> 
     </AnotherItem>
   </Items>
</Results>

I have been trying to get the code below (initially from my question here) to work. Several other users (including the code's creator) have been able to use it successfully, but when I run it the output file simply comes out as ÿþ<. I made sure to encode the file as ANSI and immediately save the Notepad file as .xml but the output still only has the byte order mark.

Option Explicit
Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Sub ParseResults()
'Requires reference to Microsoft XML, v6.0
'Requires referenc to Microsoft Scripting Runtime
Dim xmlFilePath$, newFilePath$
Dim DOM As MSXML2.DOMDocument
Dim entity As IXMLDOMNode
Dim fso As Scripting.FileSystemObject

'# Define the file you are going to load as XML
xmlFilePath = "PATH"

'# Define an output path for where to put the modified XML
newFilePath = "NEWPATH"

'# Create our DOM object
Set DOM = CreateObject("MSXML2.DOMDocument")

'# Load the XML file
DOM.Load xmlFilePath

'# Wait until the Document has loaded
Do
    Sleep 250
Loop Until DOM.readyState = 4

'# Get the entityID node
Set entity = DOM.DocumentElement.getElementsByTagName("entityId")(0)

'# Call a subroutine to append the entity to "Item" tags
AppendEntity DOM, "Item", entity
'# Call a subroutine to append the entity to "AnotherItem" tags
AppendEntity DOM, "AnotherItem", entity

'## Create an FSO to write the new file
Set fso = CreateObject("Scripting.FileSystemObject")

'## Attempt to write the new/modified XML to file
On Error Resume Next
fso.CreateTextFile(newFilePath, True, True).Write DOM.XML
If Err Then
    '## Print the new XML in the Immediate window
    Debug.Print DOM.XML
    MsgBox "Unable to write to " & newFilePath & " please review XML in the Immediate window in VBE.", vbInformation
    Err.Clear
End If
On Error GoTo 0

'Cleanup
Set DOM = Nothing
Set fso = Nothing
Set entity = Nothing

End Sub

Sub AppendEntity(DOM As Object, tagName As String, copyNode As Object)
'## This subroutine will append child node to ALL XML Nodes matching specific string tag.
Dim itemColl As IXMLDOMNodeList
Dim itm As IXMLDOMNode

'# Get a collection of all elements matching the tagName
Set itemColl = DOM.DocumentElement.getElementsByTagName(tagName)

'# Iterate over the collection, appending the copied node
For Each itm In itemColl
    If itm.HasChildNodes Then
        '# Insert this node before the first child node of Item
        itm.InsertBefore copyNode.CloneNode(True), itm.FirstChild
    Else
        '# Append this node to the Item
        itm.appendChild copyNode.CloneNode(True)
    End If
Next

Set itm = Nothing
Set itemColl = Nothing

End Sub

To be sure, the code produces no errors - it creates a new file, but the file it creates is incorrect. The correct output should be (and is for some others who have tried this code) this:

<Results xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Reference>{REFERENCE-HERE}</Reference>
  <FillerTags>Filler</FillerTags>
  <entity>
    <entityName>ABC</entityName>
    <entityId>012345</entityId>
  </entity>
  <Items>
    <Item>
      <entityId>012345</entityId>
      <FillerTagsAgain>Filler2</FillerTagsAgain>
      <FillerTagsAgain>Filler2</FillerTagsAgain>
      <FillerTagsAgain>Filler2</FillerTagsAgain> 
     </Item>
     <AnotherItem> 
       <entityId>012345</entityId>
       <FillerTagsAgain>Filler2</FillerTagsAgain>
       <FillerTagsAgain>Filler2</FillerTagsAgain>
       <FillerTagsAgain>Filler2</FillerTagsAgain> 
     </AnotherItem>
   </Items>

That is, the code inserts the as a child node of each tag. In the real XML document that I am looking to apply this code to after I can get it to work on this example XML is much the same, but contains multiple entities. For example:

   <Results xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <Reference>{REFERENCE-HERE}</Reference>
      <FillerTags>Filler</FillerTags>
      <entity>
        <entityName>ABC</entityName>
        <entityId>012345</entityId>
      </entity>
      <Items>
        <Item>
          <entityId>012345</entityId>
          <FillerTagsAgain>Filler2</FillerTagsAgain>
          <FillerTagsAgain>Filler2</FillerTagsAgain>
          <FillerTagsAgain>Filler2</FillerTagsAgain> 
         </Item>
         <AnotherItem> 
           <entityId>012345</entityId>
           <FillerTagsAgain>Filler2</FillerTagsAgain>
           <FillerTagsAgain>Filler2</FillerTagsAgain>
           <FillerTagsAgain>Filler2</FillerTagsAgain> 
         </AnotherItem>
       </Items>
     <entity>
      <entityName>DEF</entityName>
        <entityId>678910</entityId>
      </entity>
      <Items>
        <Item>
          <entityId>678910</entityId>
          <FillerTagsAgain>Filler2</FillerTagsAgain>
          <FillerTagsAgain>Filler2</FillerTagsAgain>
          <FillerTagsAgain>Filler2</FillerTagsAgain> 
         </Item>
         <AnotherItem> 
           <entityId>678910</entityId>
           <FillerTagsAgain>Filler2</FillerTagsAgain>
           <FillerTagsAgain>Filler2</FillerTagsAgain>
           <FillerTagsAgain>Filler2</FillerTagsAgain> 
         </AnotherItem>
       </Items>   

Any help troubleshooting this problem would be much appreciated.

UPDATE:

The code above now works by changing the line fso.CreateTextFile(newFilePath, True, True).Write DOM.XML to fso.CreateTextFile(newFilePath, True, False).Write DOM.XML.

I am now attempting to run this on the larger set of XML data but receive an error on the line Set entity = DOM.DocumentElement.getElementsByTagName("entityId")(0)

I had this error on the example file a few times and realized I had just forgotten to set the correct directory, but despite ensuring the directory is correct this time the error persists.

UPDATE 2: The code I am receiving this error on has been modified as below. If I'm not mistaken all I did was rename a few things, but I may be wrong.

Option Explicit
Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Sub ParseResults()
'Requires reference to Microsoft XML, v6.0
'Requires referenc to Microsoft Scripting Runtime
Dim xmlFilePath$, newFilePath$
Dim DOM As MSXML2.DOMDocument
Dim Customer As IXMLDOMNode
Dim fso As Scripting.FileSystemObject

'# Define the file you are going to load as XML
xmlFilePath = "C:\FAKEPATH\Final_Test.xml"

'# Define an output path for where to put the modified XML
newFilePath = "C:\FAKEPATH\Final_Test1.xml"

'# Create our DOM object
Set DOM = CreateObject("MSXML2.DOMDocument.6.0")

'# Load the XML file
DOM.Load xmlFilePath

'# Wait until the Document has loaded
Do
    Sleep 250
Loop Until DOM.readyState = 4

'# Get the entityID node
Set Customer = DOM.DocumentElement.getElementsByTagName("CustomerId")(0)

'# Call a subroutine to append the entity to "Item" tags
AppendCustomer DOM, "Transaction", Customer

'## Create an FSO to write the new file
Set fso = CreateObject("Scripting.FileSystemObject")

'## Attempt to write the new/modified XML to file
On Error Resume Next
'MsgBox DOM.XML
fso.CreateTextFile(newFilePath, True, False).Write DOM.XML
If Err Then
    '## Print the new XML in the Immediate window
    Debug.Print DOM.XML
    MsgBox "Unable to write to " & newFilePath & " please review XML in the Immediate window in VBE.", vbInformation
    Err.Clear
End If
On Error GoTo 0

'Cleanup
Set DOM = Nothing
Set fso = Nothing
Set Customer = Nothing

End Sub

Sub AppendCustomer(DOM As Object, Transaction As String, copyNode As Object)
'## This subroutine will append child node to ALL XML Nodes matching specific string tag.
Dim itemColl As IXMLDOMNodeList
Dim itm As IXMLDOMNode

'# Get a collection of all elements matching the tagName
Set itemColl = DOM.DocumentElement.getElementsByTagName(Transaction)

'# Iterate over the collection, appending the copied node
For Each itm In itemColl
    If itm.HasChildNodes Then
        '# Insert this node before the first child node of Item
        itm.InsertBefore copyNode.CloneNode(True), itm.FirstChild
    Else
        '# Append this node to the Item
        itm.appendChild copyNode.CloneNode(True)
    End If
Next

Set itm = Nothing
Set itemColl = Nothing

End Sub

UPDATE 3: Everything now works perfectly. The only issue is in the actual procedure pursued by the code above. Since there are multiple entities, and each set of items belongs to an entity, the code needs to find an entityId and apply this entityId to all the items that come BEFORE another occurrence of an entityId tag. After this point everything would repeat.


Solution

  • I'm putting this here initially as an answer so I can show my code legibly. Will delete if this also fails. Try this syntax to use the alternative method of writing the file. Notepadd++ tells me this is ANSII:

    '## Create an FSO to write the new file'
    Set fso = CreateObject("Scripting.FileSystemObject")
    
    
    Dim FF As Integer
    FF = FreeFile
    '## Attempt to write the new/modified XML to file'
    fso.CreateTextFile newFilePath
    Open newFilePath For Output As FF
    Print #FF, dom.XML
    Close #FF
    

    Alternatively

    (again, just covering the bases, and will update or remove if needed)

    Try:

    fso.CreateTextFile(newFilePath, True, False).Write DOM.XML
    

    The difference being the third argument in the CreateTextFile method specifies whether to create the file as Unicode (True) or ASCII (False).

    Notepad++ confirms this method is ANSII, whereas if I do True to create Unicode file, I get a UCS-2 Little Endian file.

    I personally notice no difference between either Ascii/Unicode -- I can open both in Notepad or Notepad++ and they appear the same to me, but since this seems like it could be a character-encoding issue, it is worth a shot. I suggested it only as the first (and easiest) option to implement (there are some more options to explore if needed).

    Update #3

    To address the nested nature of the file... basically you have XML element siblings ("entity" and "Item"), and you need to modify the "Item" (and it's child nodes) to include the "entityId" (which is a child of "entity"). I'm explaining this relationship so that hopefully this modification makes sense!

    '##### NO LONGER USED:'
    '# Get the entityID node'
    'Set Customer = DOM.DocumentElement.getElementsByTagName("CustomerId")(0)'
    
    Dim itm As IXMLDOMNode
    
    '# Instead of getting the first item like we did before, we can iterate the collection'
    ' of nodes with the entityID tag like so:'
    For Each Customer In DOM.DocumentElement.getElementsByTagName("entityId")
       'Since Item is Entity nextSibling, and Entity is parent of entityId,'
       ' we can iterate the collection if its childNodes like this:'
        For Each itm In Customer.ParentNode.NextSibling.ChildNodes
            If itm.HasChildNodes Then
                '# Insert this node before the first child node of Item'
                itm.InsertBefore Customer.CloneNode(True), itm.FirstChild
            Else
                '# Append this node to the Item'
                itm.appendChild Customer.CloneNode(True)
            End If
        Next
    Next
    
    '##### This function call is no longer needed
    'AppendCustomer DOM, "Transaction", Customer'
    

    This produces XML like:

    <root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <Results>
            <Reference>{REFERENCE-HERE}</Reference>
            <FillerTags>Filler</FillerTags>
            <entity>
                <entityName>ABC</entityName>
                <entityId>012345</entityId>
            </entity>
            <Items>
                <Item>
                    <entityId>012345</entityId>
                    <FillerTagsAgain>Filler1</FillerTagsAgain>
                    <FillerTagsAgain>Filler1</FillerTagsAgain>
                    <FillerTagsAgain>Filler1</FillerTagsAgain>
                </Item>
                <AnotherItem>
                    <entityId>012345</entityId>
                    <FillerTagsAgain>Filler2</FillerTagsAgain>
                    <FillerTagsAgain>Filler2</FillerTagsAgain>
                    <FillerTagsAgain>Filler2</FillerTagsAgain>
                </AnotherItem>
            </Items>
        </Results>
        <Results>
            <Reference>{REFERENCE-HERE}</Reference>
            <FillerTags>Filler</FillerTags>
            <entity>
                <entityName>DEF</entityName>
                <entityId>54321</entityId>
            </entity>
            <Items>
                <Item>
                    <entityId>54321</entityId>
                    <FillerTagsAgain>Filler1</FillerTagsAgain>
                    <FillerTagsAgain>Filler1</FillerTagsAgain>
                    <FillerTagsAgain>Filler1</FillerTagsAgain>
                </Item>
                <AnotherItem>
                    <entityId>54321</entityId>
                    <FillerTagsAgain>Filler2</FillerTagsAgain>
                    <FillerTagsAgain>Filler2</FillerTagsAgain>
                    <FillerTagsAgain>Filler2</FillerTagsAgain>
                </AnotherItem>
            </Items>
        </Results>
    </root>