xml xpath match edi

Match node based on preceding node name and all following nodes of same name as the matched node


My problem lies in transforming an EDI file, which has been converted to XML, into groups. It contains a dynamic list of messages, so there can be several UNH tags, each followed by BGM and DTM tags. I am not looking for XSLT stylesheet answers.
In this case, I can have between 1 and 35 DTM tags, ALWAYS following the BGM tag.
There are more DTM tags following other tags, like PTY, but I only want to match a DTM tag that follows a BGM tag, plus all the DTM siblings that come directly after it.

My final goal is to go through all the segments and add them to groups, as described at http://www.truugo.com/edifact/d96a/orders/, where UNH through FTX would go into GRP0, the following RFF (mandatory) and DTM (conditional) into GRP1, and so on. I wish to do this in as few steps as possible in XPath 1.0.

Thank you in advance!

Illustrative example; the nodes to match are the DTM elements directly after each BGM. In this case there can be up to 35 such nodes in a row. In other cases I may have to match up to 200000.
<UNH>
  <group>
    <value>1</value>
  </group>
</UNH>
<BGM> ... </BGM>
<DTM> ... </DTM>
<DTM> ... </DTM>

<PAI> ... </PAI>
<DTM> ... </DTM>
<PTY> ... </PTY>
...
<UNH> ... </UNH>
<BGM> ... </BGM>
<DTM> ... </DTM>
<DTM> ... </DTM>
<DTM> ... </DTM>
<DTM> ... </DTM>

<IMD> ... </IMD>
<DTM> ... </DTM>
<DTM> ... </DTM>
<PTY> ... </PTY>


Solution

  • This is rather counter-intuitive to do in XPath 1.0, but here is how to address such sets by filtering on the sibling axes:

    The first set of DTM nodes, immediately after the first BGM:

    //DTM [ (preceding-sibling::*)[name()!="DTM"][last()]  
             = //BGM[1]                                     ]
    

    The two DTM nodes following IMD in the second BGM set:

    //DTM [  (preceding-sibling::*)[name()!="DTM"][last()] 
             = //BGM[2]/following-sibling::IMD[1]           ]
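
One caveat with the expressions above: in XPath 1.0, `=` between two node-sets compares string-values rather than node identity, so a segment with the same text content as the intended BGM (or IMD) would also satisfy the test. A variant that avoids the comparison and selects the DTM run after every BGM at once is `//DTM[preceding-sibling::*[not(self::DTM)][1][self::BGM]]`. As a cross-check of which nodes that should return, here is a minimal Python sketch using only the standard library. The `<EDI>` wrapper element, the text values and the function name `dtm_runs_after_bgm` are assumptions made up for illustration. Since `xml.etree.ElementTree` supports only a subset of XPath without the `preceding-sibling` axis, the sketch walks the siblings and remembers the nearest preceding non-DTM one.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the question's fragment wrapped in an assumed <EDI> root,
# with made-up text values so the matched nodes can be told apart.
SAMPLE = """
<EDI>
  <UNH>1</UNH>
  <BGM>order-1</BGM>
  <DTM>a</DTM>
  <DTM>b</DTM>
  <PAI>p</PAI>
  <DTM>c</DTM>
  <PTY>x</PTY>
  <UNH>2</UNH>
  <BGM>order-2</BGM>
  <DTM>d</DTM>
  <DTM>e</DTM>
  <IMD>i</IMD>
  <DTM>f</DTM>
  <PTY>y</PTY>
</EDI>
"""


def dtm_runs_after_bgm(parent):
    """Return one list of DTM elements per BGM: the unbroken run of
    DTM siblings that immediately follows that BGM."""
    runs = []
    anchor = None  # tag of the nearest preceding sibling that is not a DTM
    for child in parent:
        if child.tag == "DTM":
            if anchor == "BGM":
                runs[-1].append(child)
        else:
            anchor = child.tag
            if anchor == "BGM":
                runs.append([])  # start a new (possibly empty) run
    return runs


root = ET.fromstring(SAMPLE)
texts = [[dtm.text for dtm in run] for run in dtm_runs_after_bgm(root)]
print(texts)  # [['a', 'b'], ['d', 'e']]
```

The DTM nodes after PAI and IMD (`c` and `f`) are skipped because their nearest non-DTM preceding sibling is not a BGM, which is the same condition the XPath variant expresses.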