Extracting multiple nodes from an XML document with xmllint


I am trying to use xmllint to extract several child nodes from each of multiple parent nodes matched by //item, structured as follows:

<item>
        <title>A title</title>
        <link>http://www.example.com</link>
        <pubDate>Mon, 08 Aug 2016 09:04:11 +0000</pubDate>
        <dc:creator><a name></dc:creator>
        <location><a name></location>
</item>

To extract a single node (for example title), I would normally run:

xmllint --shell myXml.xml 

and then cat //item/title, which retrieves only the title tags and their values. Can I use xmllint to get a subset of nodes, for example:

        <title>A title</title>
        <link>http://www.example.com</link>
        <pubDate>Mon, 08 Aug 2016 09:04:11 +0000</pubDate>

Thank you,


Solution

  • Here is an alternative XPath expression that selects multiple elements with different names, using the self axis and the union (|) operator:

    cat //item/*[self::title|self::link|self::pubDate]
     -------
    <title>A title</title>
     -------
    <link>http://www.example.com</link>
     -------
    <pubDate>Mon, 08 Aug 2016 09:04:11 +0000</pubDate>
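
The same expression can also be run without entering the interactive shell, by passing it to xmllint's --xpath option, which may be handier in scripts. This is a minimal sketch, assuming an xmllint build that supports --xpath. The sample file is a hypothetical reconstruction of the question's data: the rss/channel wrapper, the dc namespace declaration, and the plain-text values for dc:creator and location are assumptions added so that the document parses. How the matched nodes are separated in the output can differ between libxml2 versions.

```shell
# Hypothetical sample file modelled on the question's <item> structure.
# The wrapper elements and the xmlns:dc declaration are assumptions; without
# a declaration for the dc prefix, xmllint would report a namespace error.
cat > myXml.xml <<'EOF'
<rss xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item>
      <title>A title</title>
      <link>http://www.example.com</link>
      <pubDate>Mon, 08 Aug 2016 09:04:11 +0000</pubDate>
      <dc:creator>A name</dc:creator>
      <location>A place</location>
    </item>
  </channel>
</rss>
EOF

# Select every child of <item> that is a title, link, or pubDate element.
# The predicate is true when the union of the three self:: tests is non-empty.
xmllint --xpath '//item/*[self::title|self::link|self::pubDate]' myXml.xml
```

Single quotes around the expression keep the shell from interpreting the | and * characters. The elements left out of the predicate (dc:creator and location here) are not printed.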