
xpath - show specific line (last) of a multiline node text


Is it possible to show only a specific line of a node's multiline content? I am parsing a junit.xml file, and I want to show the last line of each error node. For example, for this data:

<testsuite>
<testcase>
<error message="test setup failure">some
lines 
of a lenghty
stacktrace
    s.shutdown(socket.SHUT_RDWR)
E   OSError: [Errno 107] Transport endpoint is not connected</error>
</testcase>
<testcase>
<error message="test setup failure">some
lines 
of a lenghty
stacktrace
    raise Exception(&quot;Connection closed by remote!&quot;)
E   Exception: Connection closed by remote!</error>
</testcase>
</testsuite>

I'd like to show only

E   OSError: [Errno 107] Transport endpoint is not connected
E   Exception: Connection closed by remote!

Getting the whole text is trivial with either of:

/testsuite/testcase/error/text()
//error/text()

I tried the XPath expression below:

//error/text()[last()]

but it doesn't work. I was able to achieve what I wanted using xmlstarlet like so:

xmlstarlet sel -t -m "//error" -v "substring-after(., 'E   ')" -n

but I was wondering whether something like that can be achieved with a pure XPath expression?


Solution

  • Since xmlstarlet only supports XPath 1.0 but does support the EXSLT extension functions, I would recommend using str:tokenize() to split the text into lines and return the last token.

    The EXSLT extension functions are supported in XPath using the "sel" command (-m and -v), but I was not able to get it to work. I think this is because of the way xmlstarlet creates the internal XSLT that is used.

    I was able to get it to work using the "tr" command though by creating my own XSLT...

    XSLT 1.0 (test.xsl)

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
      xmlns:exslt="http://exslt.org/common" 
      xmlns:str="http://exslt.org/strings" version="1.0" extension-element-prefixes="exslt str">
      <xsl:output omit-xml-declaration="yes" indent="no"/>
    
      <xsl:template match="/">
        <xsl:for-each select="//error">
          <xsl:if test="not(position()=1)">
            <xsl:text>&#xA;</xsl:text>
          </xsl:if>
          <xsl:value-of select="str:tokenize(.,'&#xA;')[last()]"/>
        </xsl:for-each>  
      </xsl:template>
    
    </xsl:stylesheet>
    

    xmlstarlet command line

    xmlstarlet tr test.xsl input.xml
    

    Output

    E   OSError: [Errno 107] Transport endpoint is not connected
    E   Exception: Connection closed by remote!
    

    If you're using something that supports XPath 2.0 (for example, an XSLT 2.0 processor), you could do something like this:

    for $err in //error return tokenize($err, '&#xA;')[last()]
    

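    Note: You may have to change &#xA; to \n in some of the online XPath testers, because &#xA; is an XML character reference and is only decoded when the expression is embedded in XML, such as a stylesheet.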
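
For comparison, the same "split on newlines, keep the last piece" idea can be sketched outside XSLT. This is only a minimal sketch, assuming Python's standard xml.etree.ElementTree module and a shortened copy of the sample data from the question; the helper name last_lines and the constant JUNIT_XML are hypothetical, not part of xmlstarlet or the original answer.

```python
import xml.etree.ElementTree as ET

# Sample input, shortened from the question's junit.xml (assumption: same structure).
JUNIT_XML = """<testsuite>
<testcase>
<error message="test setup failure">some
lines
    s.shutdown(socket.SHUT_RDWR)
E   OSError: [Errno 107] Transport endpoint is not connected</error>
</testcase>
<testcase>
<error message="test setup failure">some
lines
    raise Exception(&quot;Connection closed by remote!&quot;)
E   Exception: Connection closed by remote!</error>
</testcase>
</testsuite>"""


def last_lines(xml_text):
    """Return the last text line of every <error> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    result = []
    for error in root.iter("error"):
        # Roughly the same idea as str:tokenize(., '&#xA;')[last()]:
        # split the element text into lines and keep the final one.
        lines = (error.text or "").splitlines()
        result.append(lines[-1] if lines else "")
    return result


if __name__ == "__main__":
    print("\n".join(last_lines(JUNIT_XML)))
```

Each error element here has no child elements, so error.text holds the whole stack trace as one string, and the splitting has to happen on that string rather than by selecting among text nodes.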