Filter XML elements, removing duplicated child nodes, with Groovy XmlSlurper


I need to filter an XML list so that I get every "element" with no duplicates. Input:

<List>
<element>
    <field1>A</field1>
    <field2>1</field2>
</element>
<element>
    <field1>B</field1>
    <field2>2</field2>
</element>
<element>
    <field1>B</field1>
    <field2>2</field2>
</element>
</List>

Output:

<List>
<element>
    <field1>A</field1>
    <field2>1</field2>
</element>
<element>
    <field1>B</field1>
    <field2>2</field2>
</element>
</List>

How is it possible to achieve this with XmlSlurper or XmlParser in Groovy? Thank you.

Solution

  • Something like this is an option:

    import groovy.xml.* 
    
    def data = '''
    <List>
    <element>
        <field1>A</field1>
        <field2>1</field2>
    </element>
    <element>
        <field1>B</field1>
        <field2>2</field2>
    </element>
    <element>
        <field1>B</field1>
        <field2>2</field2>
    </element>
    </List>'''
    
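    // Parse the input into a tree of Node objects
    def xml = new XmlParser().parseText(data)

    def writer = new StringWriter()
    // Rebuild the document, emitting one <element> per distinct key
    new MarkupBuilder(writer).List {
      // unique { ... } keeps the first element seen for each key; the key here
      // is field1 only, so elements that share field1 but differ in field2
      // would also be collapsed into one
      xml.element.unique {
        it.field1.text()
      }.each { n ->
        element {
          field1(n.field1.text())
          field2(n.field2.text())
        }
      }
    }

    println writer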
    

    which, when run, prints:

    ─➤ groovy solution.groovy
    <List>
      <element>
        <field1>A</field1>
        <field2>1</field2>
      </element>
      <element>
        <field1>B</field1>
        <field2>2</field2>
      </element>
    </List>
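
If two elements should count as duplicates only when all of their children match, and not just field1, one possible variation is to remove the repeated nodes from the parsed tree in place and then serialize what is left. This is a minimal sketch, not the only way to do it: it assumes that "duplicate" means "serializes to the same XML string", and the `seen` set is just an illustrative name. The exact whitespace of the printed output depends on `XmlUtil.serialize`, which also prepends an XML declaration.

```groovy
import groovy.xml.*

def data = '''
<List>
<element>
    <field1>A</field1>
    <field2>1</field2>
</element>
<element>
    <field1>B</field1>
    <field2>2</field2>
</element>
<element>
    <field1>B</field1>
    <field2>2</field2>
</element>
</List>'''

def xml = new XmlParser().parseText(data)

// Serialized forms of the elements kept so far (assumption: two elements are
// duplicates exactly when they serialize to the same string)
def seen = new HashSet<String>()

// xml.element returns a fresh list of matching children, so removing nodes
// from the parent while iterating over it is safe
xml.element.each { node ->
    def key = XmlUtil.serialize(node)
    if (!seen.add(key)) {
        // add() returned false: an identical element was already kept
        xml.remove(node)
    }
}

println XmlUtil.serialize(xml)
```

Because the key covers the whole element, this version does not need to know the child names up front, so it keeps working if more fields are added later.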