groovy xmlstarlet

How can I use xmlstarlet in groovy?


I have a bash script that uses xmlstarlet to process some XML. I need to port the script to Groovy (for use in a Jenkins Pipeline) and I am having problems with the XML processing part. I know GPath could be used, but I would prefer to keep using xmlstarlet if possible. This is a simplified version of my bash script:

count=$(xmlstarlet sel -t -v "count(//Result/Dataset[@name='PackageMap_Dataset']/Row)" /tmp/DataPipeLineScript-output-step4.xml)
echo "$count"

To achieve the same thing I tried this Groovy code, but the script produces the wrong output:

def count="xmlstarlet sel -t -v \"count(//Result/Dataset[@name='PackageMap_Dataset'])\" DataPipeLineScript-output-step4.xml".execute().text
println "Number of detected DataSets: " + count

The simplified version without the @name predicate has the same problem:

def count="xmlstarlet sel -t -v 'count(//Result/Dataset)' DataPipeLineScript-output-step4.xml".execute().text
println "Number of detected DataSets: " + count

Even this simple invocation fails, producing no output:

println "xmlstarlet".execute().text
def count= "xmlstarlet".execute().text
println "Number of detected DataSets: " + count

How can I make this work?

For reference, here is the XML:

<Result>
  <Dataset name='PackageMap_Dataset'>
    <queryname>getcoordmissinglocations</queryname>
    <Row>
      <superfilename>~foo::indexes::develop::LocationsToEnrich::Super</superfilename>
      <indexfilename>~foo::indexes::develop::LocationsToEnrich_20160912_143427</indexfilename>
    </Row>
    <Row>
      <queryname>getcoordmissingsoiltype</queryname>
      <superfilename>~foo::indexes::develop::SoilTypesToEnrich::Super</superfilename>
      <indexfilename>~foo::indexes::develop::SoilTypesToEnrich_20160912_143427</indexfilename>
    </Row>
    <Row>
      <queryname>getngrmissinglatlong</queryname>
      <superfilename>~foo::indexes::develop::LatLongsToEnrich::Super</superfilename>
      <indexfilename>~foo::indexes::develop::LatLongsToEnrich_20160912_143427</indexfilename>
    </Row>
  </Dataset>
  <Dataset name='Result 2'>
  </Dataset>
  <Dataset name='Result 3'>
  </Dataset>
  <Dataset name='Result 4'>
  </Dataset>
</Result>

UPDATE: I have updated the code to show errors, as suggested by @cfrick:

        //def proc = "xmlstarlet sel -t -v 'count(//Result/Dataset)' DataPipeLineScript-output-step4.xml".execute();
        def proc = ["xmlstarlet", "sel", "-t", "-v", "\"count(//Result/Dataset)\"","DataPipeLineScript-output-step4.xml"].execute()
        def outputStream = new StringBuffer();
        def errorStream = new StringBuffer();
        proc.waitForProcessOutput(outputStream, errorStream);
        println("OUTPUT: " + outputStream.toString());
        println("ERROR: " + errorStream.toString());



Solution

  • Groovy simply spawns a new process with the given arguments; it is not a shell. So there is no need to quote parameters (quoting is what tells a shell to leave a parameter alone). In this case the quotes around the query are passed through to xmlstarlet, which then treats the query as a string literal and just returns that constant string (see the OUTPUT: count(//Result/Dataset) in the question).

    So use [].execute() for proper argument separation, don't quote for a shell, and don't use shell features (piping, redirection, ...):

    ["xmlstarlet", "sel", "-t", "-v", "count(//Result/Dataset)", "DataPipeLineScript-output-step4.xml"].execute()
    

    If you need "shellisms", use

    ["sh", "-c", "... | ... > 'quote weird files.txt' ..."].execute()
    

    or whatever your shell's equivalent is for such an "eval me this line" scenario.
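
The advice above can be sketched as a small helper. This is only an illustration, not part of the original answer: `runCommand` and `countRows` are hypothetical names, the demo lines assume a Unix-like system with `echo` and `sh` on the PATH, and the helper assumes that a non-zero exit status means failure (worth checking against the tool's documentation). It passes the arguments as a list, collects both output streams, and throws if the process fails.

```groovy
// Hypothetical helper (not from the original answer): runs a command given as a
// list of arguments, without a shell, and returns its trimmed standard output.
String runCommand(List<String> command) {
    def proc = command.execute()
    def out = new StringBuilder()
    def err = new StringBuilder()
    // Consume stdout and stderr and wait for the process to finish.
    proc.waitForProcessOutput(out, err)
    // Assumption: a non-zero exit status indicates an error.
    if (proc.exitValue() != 0) {
        throw new IllegalStateException("Command ${command} failed: ${err}")
    }
    return out.toString().trim()
}

// Hypothetical wrapper around the xmlstarlet call from the question.
// Assumes xmlstarlet is installed; the XPath is passed as-is, with no extra quotes.
int countRows(String xmlFile) {
    def xpath = "count(//Result/Dataset[@name='PackageMap_Dataset']/Row)"
    return runCommand(["xmlstarlet", "sel", "-t", "-v", xpath, xmlFile]).toInteger()
}

// No shell is involved, so quote characters inside an argument arrive literally.
// This prints the text including the surrounding double quotes.
println runCommand(["echo", "\"count(//Result/Dataset)\""])

// With an explicit shell, the shell strips the quotes before echo sees the argument.
// This prints the text without the double quotes.
println runCommand(["sh", "-c", "echo \"count(//Result/Dataset)\""])
```

With the file from the question in the working directory, the count could then be read with `countRows("DataPipeLineScript-output-step4.xml")`. Keeping the XPath in its own list element is what avoids the string-literal problem described above.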