Search code examples
excelxmlvbaxmldom

Why Object.Selectnodes(XPath) gets 1st node value if former node is empty


Why Object.SelectNodes(XPath) gets as 1st node value second node value if former node (real 1st node value) is empty. Example below:

XML:

<?xml version="1.0" encoding="UTF-8"?>
<Document>
    <person>
    </person>
    <person>
           <name>Peter</name>
    </person>
</Document>

VBA code:

Dim j as Integer
Dim FileToOpen as Variant
FileToOpen = Application.GetOpenFilename(Filefilter:="XML Files (*.xml), *.xml", _
Title:="Choose XML document ", MultiSelect:=False)
Set XDoc = CreateObject("MSXML2.DOMDocument")
    XDoc.async = False: XDoc.validateOnParse = False
    XDoc.Load FileToOpen

    For j = 1 To 2
        Set tofields = XDoc.SelectNodes("//Document/person/name")
        If Not (tofields.Item(j)) Is Nothing Then
            Debug.Print tofields.Item(j).Text
        Else
            Debug.Print "Nothing"
        End If
    Next j

Result:

Peter
Nothing

Why is not "Nothing" on the first place in the result ? How to reach that ? If parent node does not include 1st child node, 1st iteration is omitted. Thank you.


Solution

  • Enumeration in XMLDOM differs from XPath

    XMLDOM syntax enumerates nodes as zero based items in a NodeList, i.e. starting from 0, whereas XPath expressions identifying subnodes start from 1 (e.g. calling a first name item "//name[1]"). This mistakes in considering 2 as the last item index instead of looping from .Item(0) to .Item(1) in your example code. You get the number of found items results via the node list's .Length method (2 minus 1 results in 1 as the last index number, thus giving a name node series of 0 to 1).

    Furthermore it's recommended to reference MSXML2 version 6.0 (MSXML2.DOMDocument refers to the last stable version 3.0 used only for compatibility reasons).

    Further hints assuming that you want loop through all persons in your XML document (the nodelist in the OP delivers one item only, as the name node exists only once):

    • The expressions xDoc.SelectNodes("//Document/person") or //person would search the defined node set at any hierarchy Level within a given node structure. So it's less time consuming to use unambigously Set toFields = xDoc.DocumentElement.SelectNodes("person") in your case.
    • The following code example wouldn't show the Nothing case, as the node list displays two name nodes only (For i = 0 To toFields.Length - 1). Just in order to check your original attempt you could enumerate up to three items by changing intently to For i = 0 To toFields.Length (i.e. 0 to 2).

    Additional link

    Analyze your XML structure via recursive calls; a working function can be found at Parse XML using XMLDOM.

    Code example

        Dim xDoc As Object, toFields As Object
        Dim myName As String, i As Long
        Set xDoc = CreateObject("MSXML2.DOMDocument.6.0")   ' recommended version 6.0 (if late bound MSXML2)
        xDoc.async = False: xDoc.validateOnParse = False
        ' ...
    
        Set toFields = xDoc.DocumentElement.SelectNodes("person")
        For i = 0 To toFields.Length - 1
            If Not toFields.Item(i) Is Nothing Then
                If toFields.Item(i).HasChildNodes Then
                   myName = toFields.Item(i).SelectSingleNode("name").Text
                   Debug.Print i, IIf(Len(Trim(myName)) = 0, "**Empty name", myName)
    
                Else
                   Debug.Print i, "**No name node**"
                End If
            Else
                Debug.Print i, "**Nothing**"            ' shouldn't be needed from 0 to .Length-1 items :-)
            End If
        Next i