java xpath xstream

XPath: nodes defined in different ways


I have an XML document that is generated by persisting an object tree as XML. In this tree, nodes (objects) of a certain type can occur in different ways:

  1. They are referenced from another object; in XML terms, they are a child of a node that represents an object. The first time this happens, the node has an id attribute and is serialized in situ through child nodes:
<productionLocation class="ch.sahits.game.openpatrician.model.city.impl.City" id="935">
    <wares id="936">
        <entry>
            <ch.sahits.game.openpatrician.model.product.EWare ware="BEER"/>
            <ch.sahits.game.openpatrician.model.product.AmountablePrice id="937">
                <amount class="javafx.beans.property.SimpleIntegerProperty" id="938">127</amount>
                <sum class="javafx.beans.property.SimpleDoubleProperty" id="939">2760.0</sum>
            </ch.sahits.game.openpatrician.model.product.AmountablePrice>
        </entry>
    </wares>
    <name>London</name>
    ...
</productionLocation> 
  2. If the same object is referenced by another node, the node has a reference attribute instead of an id attribute, and there are no child nodes representing the object:
<birthPlace class="ch.sahits.game.openpatrician.model.city.impl.City" reference="935"/>
  3. Similar to the first case, the node may be referenced from a collection-type object (map, list, ...). In that case there is no class attribute; the class name is used as the node name instead. The node has either an id or a reference attribute:
<productionAndConsumption id="240">
    <entry>
        <ch.sahits.game.openpatrician.model.city.impl.City id="241">
            <wares id="242">
                <entry>
                    <ch.sahits.game.openpatrician.model.product.EWare ware="BEER"/>
                    <ch.sahits.game.openpatrician.model.product.AmountablePrice id="243">
                        <amount class="javafx.beans.property.SimpleIntegerProperty" id="244">43</amount>
                        <sum class="javafx.beans.property.SimpleDoubleProperty" id="245">0.0</sum>
                    </ch.sahits.game.openpatrician.model.product.AmountablePrice>
                </entry>
            </wares>
            <name>London</name>
            ...
        </ch.sahits.game.openpatrician.model.city.impl.City>
    </entry>
</productionAndConsumption>

What I am trying to achieve is getting a node list that represents all nodes defining objects of type ch.sahits.game.openpatrician.model.city.impl.City.

What I have so far is an XPath expression for the first two cases: //*[@class='ch.sahits.game.openpatrician.model.city.impl.City' and @id]

What I struggle with is coming up with the second part, which selects all nodes named ch.sahits.game.openpatrician.model.city.impl.City that have an id attribute, and combining it with the above.

Clarification:

Given a className, I need to find all nodes that have a class attribute equal to said className and an id attribute, or all nodes named className that have an id attribute.


Solution

  • The XPath expression is composed of two separate expressions that are combined with the union operator (|), which acts as an OR.

    The first part is from the question itself; it matches case 1 and excludes case 2:

    //*[@class='ch.sahits.game.openpatrician.model.city.impl.City' and @id]
    

    The second part selects any node named ch.sahits.game.openpatrician.model.city.impl.City that has an id attribute:

    //ch.sahits.game.openpatrician.model.city.impl.City[@id]
    

    Combining both together:

    //*[@class='ch.sahits.game.openpatrician.model.city.impl.City' and @id] 
    | //ch.sahits.game.openpatrician.model.city.impl.City[@id]
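Since the clarification asks for this to work for an arbitrary className, here is a minimal sketch of how the combined expression could be built and evaluated with the standard javax.xml.xpath API. The class DefiningNodeFinder and its methods are hypothetical names made up for this illustration, and the sample document is a trimmed-down stand-in for the XML in the question. It assumes the class name contains no single quote and is a valid XML element name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of XStream or the original project.
public class DefiningNodeFinder {

    // Builds the union of the two expressions from the answer for a given class name.
    // Assumption: className has no single quote and is usable as an element name.
    public static String definingNodesExpression(String className) {
        return "//*[@class='" + className + "' and @id] | //" + className + "[@id]";
    }

    // Returns all nodes that define (rather than reference) an object of the given class.
    public static NodeList findDefiningNodes(Document doc, String className) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (NodeList) xpath.evaluate(
                definingNodesExpression(className), doc, XPathConstants.NODESET);
    }

    // Parses an XML string into a DOM document.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String city = "ch.sahits.game.openpatrician.model.city.impl.City";
        // Trimmed-down sample covering the three cases from the question.
        String xml = "<root>"
                + "<productionLocation class=\"" + city + "\" id=\"935\">"
                + "<name>London</name></productionLocation>"
                + "<birthPlace class=\"" + city + "\" reference=\"935\"/>"
                + "<productionAndConsumption id=\"240\"><entry>"
                + "<" + city + " id=\"241\"><name>London</name></" + city + ">"
                + "</entry></productionAndConsumption>"
                + "</root>";

        NodeList nodes = findDefiningNodes(parse(xml), city);
        // Print the element name of every defining node that was found.
        for (int i = 0; i < nodes.getLength(); i++) {
            System.out.println(nodes.item(i).getNodeName());
        }
    }
}
```

For the sample document this selects two nodes: the productionLocation element and the City element inside the collection. The birthPlace element is skipped because it only carries a reference attribute.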