XPath to concatenate two subnodes for each node


I have an XML document as follows:

<rss version="2.0">
    <channel>
        <language>en</language>
        <pubDate>Tue, 19 Mar 2024 06:06:35 GMT</pubDate>
        <item>
            <title>Title1</title>
            <category>91021</category>
        </item>
        <item>
            <title>Title2</title>
            <category>91022</category>
        </item>
        <item>
            <title>Title3</title>
            <category>91023</category>
        </item>
    </channel>
</rss>

I want an XPath expression that returns one line per item, concatenating the title and category, like this:

Title1,91021
Title2,91022
Title3,91023

I am limited to xmllint, and therefore to XPath 1.0. How can I do this? I have tried variants of the following, with no success:

//item/(concat(title/text(), ".", category/text())

Solution

  • In pure XPath you would need string-join(), which requires XPath 2.0. In XPath 1.0, concat() converts each node-set argument to the string value of its first node, so it can only compose the first match, not all of them.

    You could, however, use the union operator | to produce each value (in document order) and then arrange them with an external tool such as paste. This assumes the values contain no line breaks; file.xml stands for your input file:

    xmllint --xpath '//item/title/text() | //item/category/text()' file.xml | paste -d, - -
    
    # or just (if the items have no other child elements with text)
    xmllint --xpath '//item/*/text()' file.xml | paste -d, - -
    
    Title1,91021
    Title2,91022
    Title3,91023
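
If you would rather keep the concatenation inside XPath 1.0, one possible workaround for the first-match limitation of concat() is to let the shell loop over the item positions and evaluate concat() once per item. The following is only a sketch: items_csv is a hypothetical helper name, it assumes an xmllint build that supports --xpath, and it assumes every item has exactly one title and one category.

```shell
# items_csv FILE: print "title,category" for each <item> in FILE.
# items_csv is a hypothetical helper name, not part of xmllint.
items_csv() {
  file=$1
  # count() yields a number, which xmllint prints as plain text.
  n=$(xmllint --xpath 'count(//item)' "$file") || return 1
  i=1
  while [ "$i" -le "$n" ]; do
    # (//item)[i] selects the i-th item in document order; concat() then
    # uses the string value of its first title and first category child.
    xmllint --xpath "concat((//item)[$i]/title, ',', (//item)[$i]/category)" "$file"
    i=$((i + 1))
  done
}
```

Calling items_csv file.xml should then print one title,category line per item. Note that this parses the file once per item, so the paste approach is likely the better choice for large feeds.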