I have an XML grouping challenge in which I need to group AND remove duplicates. The input looks like this:
<Person>
<name>John</name>
<date>June12</date>
<workTime taskID=1>34</workTime>
<workTime taskID=1>35</workTime>
<workTime taskID=2>12</workTime>
</Person>
<Person>
<name>John</name>
<date>June13</date>
<workTime taskID=1>21</workTime>
<workTime taskID=2>11</workTime>
<workTime taskID=2>14</workTime>
</Person>
Note that for each distinct combination of name/taskID/date, only the first occurrence is picked up. In this example,
<workTime taskID=1>35</workTime>
<workTime taskID=2>14</workTime>
would be left aside.
Below is the expected output:
<Person>
<name>John</name>
<taskID>1</taskID>
<workTime>
<date>June12</date>
<time>34</time>
</workTime>
<workTime>
<date>June13</date>
<time>21</time>
</workTime>
</Person>
<Person>
<name>John</name>
<taskID>2</taskID>
<workTime>
<date>June12</date>
<time>12</time>
</workTime>
<workTime>
<date>June13</date>
<time>11</time>
</workTime>
</Person>
I have tried to use Muenchian grouping in XSLT 1.0 with the key below:
<xsl:key name="PersonTasks" match="workTime" use="concat(@taskID, ../name)"/>
but then how do I pick up only the first occurrence of
concat(@taskID, ../name, ../date)
? It seems that I need two levels of keys!?
This transformation:
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- workTime elements indexed by person name and task -->
  <xsl:key name="kwrkTimeByNameTask" match="workTime"
           use="concat(../name, '+', @taskID)"/>

  <!-- date elements indexed by person name -->
  <xsl:key name="kDateByName" match="date"
           use="../name"/>

  <!-- workTime elements indexed by person name, task and date -->
  <xsl:key name="kwrkTimeByNameTaskDate" match="workTime"
           use="concat(../name, '+', @taskID, '+', ../date)"/>

  <xsl:template match="/">
    <!-- Muenchian grouping: first workTime of every name/taskID pair -->
    <xsl:for-each select=
      "*/*/workTime
          [generate-id()
          =
           generate-id(key('kwrkTimeByNameTask',
                           concat(../name, '+', @taskID)
                           )[1]
                       )
          ]
      ">
      <xsl:sort select="../name"/>
      <xsl:sort select="@taskID" data-type="number"/>

      <xsl:variable name="vcurTaskId" select="@taskID"/>

      <Person>
        <name><xsl:value-of select="../name"/></name>
        <taskID><xsl:value-of select="@taskID"/></taskID>
        <!-- Dates of this person that have a workTime for this task -->
        <xsl:for-each select=
          "key('kDateByName', ../name)
              [key('kwrkTimeByNameTaskDate',
                   concat(../name, '+', current()/@taskID, '+', .)
                   )
              ]
          ">
          <workTime>
            <date><xsl:value-of select="."/></date>
            <time>
              <!-- value-of outputs only the first matching workTime -->
              <xsl:value-of select=
                "key('kwrkTimeByNameTaskDate',
                     concat(../name, '+', $vcurTaskId, '+', .)
                     )"/>
            </time>
          </workTime>
        </xsl:for-each>
      </Person>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
when applied to the provided XML (after correcting several issues to make it well-formed):
<t>
<Person>
<name>John</name>
<date>June12</date>
<workTime taskID="1">34</workTime>
<workTime taskID="1">35</workTime>
<workTime taskID="2">12</workTime>
</Person>
<Person>
<name>John</name>
<date>June13</date>
<workTime taskID="1">21</workTime>
<workTime taskID="2">11</workTime>
<workTime taskID="2">14</workTime>
</Person>
</t>
produces the wanted, correct result:
<Person>
<name>John</name>
<taskID>1</taskID>
<workTime>
<date>June12</date>
<time>34</time>
</workTime>
<workTime>
<date>June13</date>
<time>21</time>
</workTime>
</Person>
<Person>
<name>John</name>
<taskID>2</taskID>
<workTime>
<date>June12</date>
<time>12</time>
</workTime>
<workTime>
<date>June13</date>
<time>11</time>
</workTime>
</Person>
Explanation:
First we obtain all workTime elements with unique pairs of ../name, @taskID by using the Muenchian method for grouping.
We sort these by ../name and @taskID, in that order.
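As a minimal sketch of this first step in isolation, the stylesheet below emits one element per distinct name/taskID pair and nothing else. The key name kByNameTask and the output elements groups and group are made up for this illustration; they are not part of the solution above.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Hypothetical key name: index every workTime by name + taskID -->
  <xsl:key name="kByNameTask" match="workTime"
           use="concat(../name, '+', @taskID)"/>

  <xsl:template match="/">
    <groups>
      <!-- Keep a workTime only if it is the first node of its key group -->
      <xsl:for-each select=
        "*/*/workTime
            [generate-id()
            =
             generate-id(key('kByNameTask',
                             concat(../name, '+', @taskID)
                             )[1]
                         )
            ]
        ">
        <xsl:sort select="../name"/>
        <xsl:sort select="@taskID" data-type="number"/>
        <!-- One (illustrative) element per distinct name/taskID pair -->
        <group name="{../name}" taskID="{@taskID}"/>
      </xsl:for-each>
    </groups>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the well-formed input above, this should produce two group elements, one for John/1 and one for John/2, because only the workTime elements with values 34 and 12 are the first nodes of their respective key groups.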
For each such workTime we get all date elements that are listed with the ../name of this workTime, and keep only those date elements for which there is a workTime that has the same ../name, the same ../date and the same @taskID as the current workTime.
In the previous step we use two different auxiliary keys: 'kDateByName' indexes all date elements by their ../name, while 'kwrkTimeByNameTaskDate' indexes all workTime elements by their ../name, their @taskID and their ../date.
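The three-part key is also what drops the duplicates: in XSLT 1.0, xsl:value-of applied to a node-set outputs the string value of only the first node in document order. A small sketch of that behavior, assuming the well-formed input above and using text output purely for illustration:

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Same key as in the solution: name + taskID + date -->
  <xsl:key name="kwrkTimeByNameTaskDate" match="workTime"
           use="concat(../name, '+', @taskID, '+', ../date)"/>

  <xsl:template match="/">
    <!-- The key holds both workTime elements (34 and 35) for John/1/June12 -->
    <xsl:value-of select=
      "count(key('kwrkTimeByNameTaskDate', 'John+1+June12'))"/>
    <xsl:text> nodes, first: </xsl:text>
    <!-- value-of converts the node-set to a string: first node only -->
    <xsl:value-of select=
      "key('kwrkTimeByNameTaskDate', 'John+1+June12')"/>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input this should print "2 nodes, first: 34", which is why 35 (and likewise 14) never appears in the result.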
So, the meaning of the following:
<xsl:for-each select=
  "key('kDateByName', ../name)
      [key('kwrkTimeByNameTaskDate',
           concat(../name, '+', current()/@taskID, '+', .)
           )
      ]
  ">
is:
For each date for that name, such that a workTime for that name, date and @taskID (of the current workTime for the outer <xsl:for-each>) exists, do whatever is in the body of this <xsl:for-each> instruction.
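The use of current() matters here: inside a predicate, "." and relative paths refer to the node being filtered (the date), while current() still refers to the node selected by the enclosing instruction (the workTime). A small sketch of the difference, assuming the well-formed input above; the text labels are only for illustration:

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The current node becomes the first workTime (34) of the first Person -->
    <xsl:for-each select="*/Person[1]/workTime[1]">
      <xsl:text>with current(): </xsl:text>
      <!-- current()/../date is the date of that workTime: June12 -->
      <xsl:value-of select="count(//date[. = current()/../date])"/>
      <xsl:text>, without: </xsl:text>
      <!-- here ../date is resolved from each date being tested,
           so every date is compared with itself -->
      <xsl:value-of select="count(//date[. = ../date])"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

This should print "with current(): 1, without: 2": only the first form ties the test to the workTime of the outer <xsl:for-each>, which is what the solution needs for @taskID.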