XPath: nodes defined in different ways

I have an XML document that is generated by persisting an object tree to XML. In this tree, nodes (objects) of a certain type can occur in different ways:

  1. They are referenced from another object; in XML terms, this means they are a child of a node that represents an object. The first time this happens, the node has an id attribute and is serialized in situ through child nodes:
<productionLocation class="" id="935">
    <wares id="936">
        < ware="BEER"/>
        < id="937">
            <amount class="" id="938">127</amount>
            <sum class="" id="939">2760.0</sum>
  2. If the same object is referenced by another node, the node has a reference attribute instead of an id attribute, and there are no child nodes representing the object:
<birthPlace class="" reference="935"/>
  3. Similar to the first case, the node may be referenced from a collection-type object (map, list, ...), in which case there is no class attribute; instead, the class name is used as the node name. There will be either an id or a reference attribute:
<productionAndConsumption id="240">
    < id="241">
        <wares id="242">
            < ware="BEER"/>
            < id="243">
                <amount class="" id="244">43</amount>
                <sum class="" id="245">0.0</sum>

What I am trying to achieve is getting a node list that represents all nodes defining objects of type

What I have so far is an XPath expression for the first two cases: //*[@class='' and @id]

What I struggle with is coming up with the second part, which selects all nodes named after the class that have an id attribute, and combining it with the above.


Given a className, I need to find all nodes that have a class attribute equal to that className and an id attribute, or all nodes named className that have an id attribute.


  • The XPath expression is composed of two separate expressions combined with the union operator (|).

    The first part is from the question itself; it matches case 1) and excludes case 2):

    //*[@class='' and @id]

    The second part selects any node named after the class that has an id attribute:


    Combining both together:

    //*[@class='' and @id] 
    | //[@id]
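
    As a rough illustration of the combined selection, here is a minimal Python sketch. Everything concrete in it is an assumption: the class name `Location`, the sample document, and the helper `defining_nodes` are hypothetical stand-ins, since the real class name is not shown above. With a full XPath 1.0 engine, the single expression would look like `//*[@class='Location' and @id] | //Location[@id]`. Python's standard `xml.etree.ElementTree` supports only a subset of XPath (no `and`, no `|`), so the sketch chains predicates and merges the two result lists by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; "Location" stands in for the real class name.
SAMPLE = """
<root>
  <productionLocation class="Location" id="935"><wares id="936"/></productionLocation>
  <birthPlace class="Location" reference="935"/>
  <productionAndConsumption id="240">
    <Location id="241"><wares id="242"/></Location>
    <Location reference="241"/>
  </productionAndConsumption>
</root>
"""


def defining_nodes(root, class_name):
    """Return elements that define (not merely reference) objects of class_name."""
    # Part 1: any element with class=class_name and an id attribute.
    # Chained predicates replace "and", which ElementTree does not support.
    by_attribute = root.findall(f".//*[@class='{class_name}'][@id]")
    # Part 2: elements named after the class that carry an id attribute.
    by_name = root.findall(f".//{class_name}[@id]")
    # Manual union: drop duplicates and restore document order.
    matched = {id(e) for e in by_attribute + by_name}
    return [e for e in root.iter() if id(e) in matched]


root = ET.fromstring(SAMPLE)
ids = [e.get("id") for e in defining_nodes(root, "Location")]
print(ids)
```

    Nodes that only carry a reference attribute (case 2) are skipped by both parts, because each part requires an id attribute.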