How can I get a child node's text from an element that is specified by its attribute?


I have an XML file that looks like this:

[data.xml]

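<?xml version="1.0"?>
<!-- well-formed XML needs exactly one top-level element; the name "root" is arbitrary -->
<root>
    <elem1 id="obj1">
        <celem1>Text1</celem1><celem2>Text2</celem2>
    </elem1>
    <elem2 id="obj2">
        <celem1>Text3</celem1><celem2>Text4</celem2>
    </elem2>
</root>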

And a function for reading the XML that looks like this:

Function GetVar(XMLTag, strNum)
   Set oXMLFile = CreateObject("Msxml2.DOMDocument")
       oXMLFile.Load("data.xml")
   Set oXMLFileVariable = oXMLFile.getElementsByTagName(XMLTag)
       GetVar = oXMLFileVariable.Item(strNum).Text
End Function

Calling the function like this:

    Call GetVar("celem1", 0)
    Call GetVar("celem2", 0)
    Call GetVar("celem1", 1)
    Call GetVar("celem2", 1)

will return:

"Text1"
"Text2"
"Text3"
"Text4"

I would like to be able to return the child node's text by specifying its parent node's attribute. Something like this:

[pseudo code - forgive me if I'm way off here]

    GetChildNode.Text(elem1(GetAttribute="obj1").celem1())

Would return something like this:

"Text1"

The reason I ask is that I would like to replace the specific element names with generic ones, and then pick out a particular element's text by specifying attributes. I don't like creating and maintaining a unique element tag for every new entry in the XML document.

I'm currently using VBScript, but I could change to something else that works in a Windows environment.

[---EDIT---]

Using Ansgar Wiechers' examples, I have created the following:

[data.xml]

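<?xml version="1.0"?>
<!-- well-formed XML needs exactly one top-level element; the name "root" is arbitrary -->
<root>
  <elem id="obj1">
      <celem id="item1">Text1</celem><celem id="item2">Text2</celem>
  </elem>
  <elem id="obj2">
      <celem id="item1">Text3</celem><celem id="item2">Text4</celem>
  </elem>
</root>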

And the script:

str1 = GetVar("obj1", "celem", "item1")
str2 = GetVar("obj2", "celem", "item2")
MsgBox str1
MsgBox str2

Function GetVar(parentID, childNode, childAtt)
    GetVar = Null 'set a safe default return value

    Set oXMLFile = CreateObject("Msxml2.DOMDocument.6.0")
    oXMLFile.async = False
    oXMLFile.Load "data.xml"

    If oXMLFile.parseError = 0 Then
        xpath = "//*[@id='" & parentID & "']/" & childNode _
              & "[@id='" & childAtt & "']"
        Set node = oXMLFile.selectSingleNode(xpath)
        If Not node Is Nothing Then GetVar = node.text
    Else
        'report errors
        WScript.Echo oXMLFile.parseError.reason
    End If
End Function


The first MsgBox displays "Text1".
The second MsgBox displays "Text4".

This is exactly what I was looking for!!


Solution

  • Use an XPath expression for selecting the node(s):

    Function GetVar(parentId, childNode)
      GetVar = Null  'set a safe default return value
    
      Set oXMLFile = CreateObject("Msxml2.DOMDocument.6.0")
      oXMLFile.async = False
      oXMLFile.load "data.xml"
    
      'Without having set async to False the above instruction would load the
      'file in the background, so your code would try to process data that
      'isn't completely loaded yet.
    
      If oXMLFile.parseError = 0 Then
        xpath = "//*[@id='" & parentId & "']/" & childNode
        Set node = oXMLFile.selectSingleNode(xpath)
        If Not node Is Nothing Then GetVar = node.text
      Else
        'report errors
        WScript.Echo oXMLFile.parseError.reason
      End If
    End Function
    

    The XPath expression //*[@id='obj1']/celem1 means: select every <celem1> node whose parent node has an attribute id with the value obj1. The selectSingleNode method returns only the first such node, or Nothing if no node matches.

    If there can be more than one matching node and you don't want just the value of the first one returned, you need to define how you want to handle the values of the other nodes.
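
If all matching nodes should be returned rather than only the first, one option is the selectNodes method, which returns a node list instead of a single node. The following is a minimal sketch and not part of the original answer: the helper name GetAllVars is hypothetical, returning the texts as an array is just one possible way of handling several values, and the code assumes data.xml is well-formed (a single top-level element).

```vbscript
'Hypothetical helper (not from the original answer): returns the text of
'every node matching //*[@id='<parentId>']/<childNode> as an array.
Function GetAllVars(parentId, childNode)
  Dim oXMLFile, nodes, result(), i

  GetAllVars = Array()  'empty array as a safe default return value

  Set oXMLFile = CreateObject("Msxml2.DOMDocument.6.0")
  oXMLFile.async = False
  oXMLFile.load "data.xml"  'assumed to be well-formed

  If oXMLFile.parseError.errorCode <> 0 Then
    'report errors and keep the empty default
    WScript.Echo oXMLFile.parseError.reason
    Exit Function
  End If

  'selectNodes returns a (possibly empty) node list
  Set nodes = oXMLFile.selectNodes("//*[@id='" & parentId & "']/" & childNode)
  If nodes.length = 0 Then Exit Function

  'copy the text of each matching node into the result array
  ReDim result(nodes.length - 1)
  For i = 0 To nodes.length - 1
    result(i) = nodes.item(i).text
  Next

  GetAllVars = result
End Function

'Usage: echo all matches as one comma-separated line
WScript.Echo Join(GetAllVars("obj1", "celem"), ", ")
```

With generic element names (several <celem> children under one parent), the usage line echoes the text of every <celem> child of the element whose id is obj1. When nothing matches, the function returns an empty array and Join produces an empty string.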