For example, given XML:
<root>
<item>
<id>111</id>
<description>aisle 12, shelf 3</description>
<description>inside the box</description>
</item>
</root>
I would like the result:
<root>
<item>
<id>111</id>
<description>aisle 12, shelf 3 inside the box</description>
</item>
</root>
However, the element may have any name and appear at any level. I would like the same query to work with different XML, as long as the tag is repeated:
<root>
<item>
<id>112</id>
<attributes>
<author>Joe Smith</author>
<author>Arthur Clarke</author>
<author>Jeremiah Wright</author>
</attributes>
</item>
</root>
Output:
<root>
<item>
<id>112</id>
<attributes>
<author>Joe Smith Arthur Clarke Jeremiah Wright</author>
</attributes>
</item>
</root>
Is this possible with BaseX? If not, can we do this for a known element (for example, only for /root/item/attributes/author)?
Making sure that only directly adjacent siblings are merged complicates things a little. The comments in the code below explain how it works.
let $xml := document{<root>
<item>
<id>112</id>
<attributes>
<author>Joe Smith</author>
<author>Arthur Clarke</author>
<author>Jeremiah Wright</author>
<foo/>
<author>Donald Duck</author>
</attributes>
</item>
</root>}
return
(: Use an XQuery Update transformation :)
copy $copy := $xml
modify (
(: Loop over all leaves (elements without child elements, i.e. only text). :)
(: This might have to be adjusted if you want to merge arbitrary nodes. :)
for $leaf in $copy//*[not(*)]
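(: Where the immediately preceding sibling is not of the same name :)
(: (otherwise this leaf is merged into the leaf that starts its run). :)
(: Note: [1][...] tests only the nearest sibling; [1 and ...] would match any. :)
where not($leaf/preceding-sibling::*[1][name(.) eq name($leaf)])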
(: Now find following siblings... :)
let $siblings := $leaf/following-sibling::*[
(: ... of the same name ... :)
name(.) eq name($leaf) and
(: ... and that do not have a node with another name in-between :)
not(preceding-sibling::*[name(.) != name($leaf) and $leaf << .])
]
return (
(: Merge text contents into $leaf :)
replace value of node $leaf with string-join(($leaf, $siblings), ' '),
(: And delete all others :)
delete nodes $siblings
)
)
return $copy
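For the narrower case asked about in the question, where the element is known in advance, a simpler sketch could look like the following. It assumes the fixed path /root/item/attributes/author taken from the question. Unlike the query above, it merges all author children of the same parent, even when another element (such as foo) sits between them, so use it only if that is acceptable.

```xquery
(: Sample input; same shape as the document used above. :)
let $xml := document {
  <root>
    <item>
      <id>112</id>
      <attributes>
        <author>Joe Smith</author>
        <author>Arthur Clarke</author>
        <author>Jeremiah Wright</author>
        <foo/>
        <author>Donald Duck</author>
      </attributes>
    </item>
  </root>
}
return
  copy $copy := $xml
  modify (
    (: Assumption: the path is fixed and known; adjust it for other documents. :)
    for $parent in $copy/root/item/attributes[author]
    let $authors := $parent/author
    return (
      (: Put all author values, space-separated, into the first author ... :)
      replace value of node $authors[1] with string-join($authors, ' '),
      (: ... and delete the remaining ones (adjacent or not). :)
      delete nodes tail($authors)
    )
  )
  return $copy
```

With the sample input, this should leave a single author element holding all four names, followed by the untouched foo element.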