Search code examples
excelvbaxpathxmldom

VBA - What's The XPath Syntax To Get Tag Names


I am trying to use a VBA macro to parse XML file. Given with the following structure:

  <bookstore>
  <book category="children">
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="web">
    <title>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
  </bookstore>

How can I enumerate the output with element tags with its corresponding values as shown below?

book | category | children
title | harry potter
author | J K. Rowling
...

My code as follows:

Set xmlFile = CreateObject("Microsoft.XMLDOM")
xmlFile.Load (file)
Set qXML = xmlFile.SelectNodes("/bookstore")
For i = 0 To qXML.Length - 1
  Debug.Print CStr(qXML(i).Text)
Next i

Solution

  • How to get Tag Names

    "What's the XPath syntax to get Tag Names?"

    Strictly speaking it's (XML)DOM syntax to to get .Name and/or .NodeName properties; XMLDOM (Document Object Model) is a cross-platform and language-independent interface treating the document as a tree structure and allowing programmatic access to the tree.

    You can use, however the special syntax of XPath expressions (like e.g. "/bookstore/book/title") to address any logical part in the hierarchical xml document structure.

    So a solution close to your OP would be:

    Option Explicit             ' declaration head of your code module
    
    Sub ExampleCall()
        Dim file As String: file = ThisWorkbook.Path & "\xml\bookstore.xml"
        Dim xmlFile As Object
        Set xmlFile = CreateObject("Microsoft.XMLDOM")
        If xmlFile.Load(file) Then
            Dim qXML As Object
            Set qXML = xmlFile.DocumentElement.SelectNodes("book")
            Dim q As Object
            For Each q In qXML
                Dim cnt As Long: cnt = cnt + 1
                Debug.Print Format(cnt, "--- 000 ---")
                Debug.Print q.Attributes(0).Name, "|" & q.Attributes(0).Text
                Dim i As Long
                For i = 0 To q.ChildNodes.Length - 1
                    Debug.Print q.ChildNodes(i).nodeName, "|" & q.ChildNodes(i).Text
                Next
            Next
        End If
    End Sub
    
    

    Results in VBE's immediate window

    --- 01 ---
    category      |children
    title         |Harry Potter
    author        |J K. Rowling
    year          |2005
    price         |29.99
    --- 02 ---
    category      |web
    title         |Learning XML
    author        |Erik T. Ray
    year          |2003
    price         |39.95
    
    

    Side note

    As Microsoft.XMLDOM has been deprecated for years, I'd prefer binding to ►MSXML2 in the most current xml version Microsoft XML,v6.0, e.g. via

    I. LATE Binding (as in OP)

        Dim xDoc As Object
        Set xDoc = CreateObject("MSXML2.DOMDocument.6.0")
    

    II. EARLY Binding

        Dim xDoc As MSXML2.DOMDocument60     ' *) whereas MSXML2.DOMDocument (=old version 3.0)
        Set xDoc = New MSXML2.DOMDocument60  ' mind the missing point in digits
    

    Side note: OP uses the object variable XMLFile instead of xDoc

    Note that referencing DOMDocument without obvious versioning would bind internally to 3.0 by default (the last stable version before 6.0, any other versions are deprecated).

    Further links