Search code examples
ms-accessxml-parsingvbamsxml

Parse XML File with VBA


I have a XML file with a structure similar to this:

<egh_eval>
<eval_set>
    <eval_id>FLOAT</eval_id>
    <eval_d>
        <height>INT</height>
        <weight>INT</weight>
    </eval_d>
    <eval_e>
        <height>INT</height>
        <weight>INT</weight>
    </eval_e>
    <eval_cred>
        <credit>FLOAT</credit>
    </eval_cred>
</eval_set>

I need to parse the complete file and put it in a table. (Note: eval_d and eval_e actually have more than a hundred attributes each). I tried using MSXML2 however I get stuck when I try to parse the file. By using the answers at How to pase XML in VBA and Parse XML in VBA I was able to get there :

Dim fSuccess As Boolean
Dim oDoc As MSXML2.DOMDocument
Dim oRoot As MSXML2.IXMLDOMNode ' Level 0 egh_eval
Dim oChild As MSXML2.IXMLDOMNode ' Level 1 eval_set
Dim oChildren As MSXML2.IXMLDOMNode ' Level 2 eval_id, eval_d, eval_e, eval_cred


Dim domList As MSXML2.IXMLDOMNodeList

Set oDoc = New MSXML2.DOMDocument
oDoc.async = False
oDoc.validateOnParse = False

fSuccess = oDoc.Load(Application.CurrentProject.Path & "\file.xml")

Set oRoot = oDoc.documentElement
Set oChild = oRoot.childNodes(0)
Set oChildren = oChild.childNodes(0)

For i = 0 To oChild.childNodes.length - 1
    For y = 0 To oChildren.childNodes.length - 1
        MsgBox oChildren.nodeName & " : " & oChildren.nodeTypedValue
        oChildren.childNodes.nextNode
    Next
    oChild.childNodes.nextNode
Next

However, instead of giving me the right values, it gives me the float in eval_id 4 times...

Thanks !

EDIT: I am using Microsoft Access 2002 SP3


Solution

  • Your loop is all wrong. Don't use a counted loop. There is For Each which will do exactly what you need, and it's much more readable, too.

    Dim egh_eval As MSXML2.IXMLDOMNode
    Dim eval_set As MSXML2.IXMLDOMNode
    Dim eval_prop As MSXML2.IXMLDOMNode
    
    Set egh_eval = oDoc.documentElement.childNodes(0)
    
    For Each eval_set In egh_eval.childNodes
      If eval_set.nodeType = NODE_ELEMENT Then
        For Each eval_prop In eval_set.childNodes
          If eval_prop.nodeType = NODE_ELEMENT Then
            MsgBox eval_prop.nodeName & " : " & eval_prop.childNodes.length
          End If
        Next eval_prop
      End If
    Next eval_set
    

    When you use childNodes you must check the nodeType property. Comments, text nodes and so on will all be in the list of child nodes, not just element nodes.

    It might be a good idea to look into using XPath to select your elements. Doing this with DOM methods is error-prone and cumbersome. Read up on IXMLDOMNode::selectNodes and IXMLDOMNode::selectSingleNode.

    For Each eval_set In oDoc.selectNodes("/egh_eval/eval_set")
      Set eval_id = eval_set.selectSingleNode("eval_id")
    
      ' always check for empty search result!
      If Not eval_id Is Nothing Then
        MsgBox eval_id.text
        ' ...
      End If
    Next eval_set
    

    Also, on a general note. This:

    fSuccess = oDoc.Load(Application.CurrentProject.Path & "\file.xml")
    

    is actually both not necessary and a bad idea, since you do never seem to check the value of fSuccess). Better:

    Sub LoadAndProcessXml(path As String)
      Dim oDoc As MSXML2.DOMDocument
    
      If oDoc.Load(path) Then
        ProcessXmlFile oDoc
      Else
        ' error handling
      End If
    End Sub
    
    Sub ProcessXml(doc As MSXML2.DOMDocument) 
      ' Process the contents like shown above
    End Sub
    

    Creating multiple subs/functions has several advantages

    • Makes error handling much easier, since every function only has one purpose.
    • You will need less variables, since you can define some variables in the function arguments
    • Code will become more maintainable, since it's more obvious what 3 short functions do than what one long function does.