
xmllint match and extract value


I would like to extract only the value under "image" from the XML, but I'm getting an error. I assume it's because the first couple of jobs don't include "image".

<?xml version='1.0' encoding='utf-8'?>
<document>
    <job name="Job1">
        <type>
            <description>
            </description>
        </type>
    </job>
    <job name="Job2">
        <type>
            <description>
            </description>
        </type>
    </job>
    <job name="Job3">
        <type>
            <description>
                <image>
                    <png></png>
                </image>
            </description>
        </type>
    </job>
</document>

How can I skip the first two jobs and match the image tag in the third?

xmllint --xpath "//*[local-name()='document']/job/type/description/image/png/text()" file

Solution

  • If I modify only one line of your input file, changing

    <png></png>
    

    to

    <png>Some text goes here</png>
    

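    ...then your code works perfectly as already written. An empty <png></png> element contains no text node, so text() selects nothing and xmllint has nothing to print. Thus, the error has nothing at all to do with the first few descriptions having no image or png; jobs without those elements simply don't match the path.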


    Mind you, since you don't have any namespaces defined, you don't need to mess with local-name() at all.

    xmllint --xpath "/document/job/type/description/image/png/text()" file
    

    ...works just as well.
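
If you want to check the same behaviour without xmllint, here is a minimal sketch using Python's standard xml.etree.ElementTree. It is a swapped-in illustration, not what xmllint does internally: the `png_texts` helper and the inlined copy of your document are made up for the example, and since ElementTree's path syntax has no text() step, the sketch reads each element's `.text` attribute instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline copy of the question's document; {content} is the
# only part that changes between the two runs below.
XML_TEMPLATE = """<document>
    <job name="Job1"><type><description></description></type></job>
    <job name="Job2"><type><description></description></type></job>
    <job name="Job3"><type><description>
        <image><png>{content}</png></image>
    </description></type></job>
</document>"""


def png_texts(xml_source):
    """Return the non-empty text of every job/type/description/image/png."""
    root = ET.fromstring(xml_source)  # root is the <document> element
    # Jobs without <image> simply do not match the path; nothing fails.
    matches = root.findall("job/type/description/image/png")
    # An empty <png></png> has .text set to None, so it is filtered out.
    return [png.text for png in matches if png.text]


empty = png_texts(XML_TEMPLATE.format(content=""))
filled = png_texts(XML_TEMPLATE.format(content="Some text goes here"))

print(empty)   # []
print(filled)  # ['Some text goes here']
```

The first two jobs never cause a problem in either run. The only thing that changes the result is whether png actually contains text.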