Groovy script to append an XML node takes 15+ hours for 6K employee records

The code below works, but it takes 15+ hours to process 6000 employee records. Are any improvements possible?

I have two record structures (employee data and employee benefits) for each of 6000 employees, and I have merged them into a single XML using the personnel number (for the XML structure, please see my previous question).

Now I have to append a node/subnode to the employee record in the XML whenever the ID (personIdExternal) in multimap:Message1 matches the same ID (PERNR) in multimap:Message2.

 xml.'**'.findAll{ it.name() == 'EmpEmployment' }.each{ p ->
     // find the employee node, then the benefit node whose PERNR matches it
     def perID = xml.'**'.find{ it.personIdExternal.text() == p.personIdExternal.text() }
     def pernr = xml.'**'.find{ it.PERNR.text() == '000' + perID.personIdExternal.text() }
     if (pernr != null) {
         perID.appendNode {
             erpBenEligibility(pernr.PARDT.text())
         }
     }
 }


Sample XML:

<?xml version='1.0' encoding='UTF-8'?>
<multimap:Messages xmlns:multimap="">
     <personIdExternal> 001 </personIdExternal>
     <personIdExternal> 002 </personIdExternal>
     <personIdExternal> 003</personIdExternal>
<rfc:ZHR_GET_EMP_BENEFIT_DETAILS.Response xmlns:rfc="urn:sap-com:document:sap:rfc:functions">
  <PERNR> 001 </PERNR>
  <PARDT>#### 1 ####</PARDT>
  <PERNR> 002 </PERNR>
  <PARDT>#### 2 ####</PARDT>
  <PERNR> 003 </PERNR>
  <PARDT>#### 3 ####</PARDT>
</rfc:ZHR_GET_EMP_BENEFIT_DETAILS.Response>
</multimap:Messages>


  • some major issues in your code:

    • using the .'**' accessor. if you have 10000 persons in message1, then xml.'**' returns a list with count(person)+count(EmpEmployment)+count(personIdExternal) = 10000*3 elements, and every findAll or find called on that list has to scan all of those elements
    • inside the main loop xml.'**'.findAll{ it.name() == 'EmpEmployment' }.each{ ... } you build large nested lists for no reason. for example, after the expression def perID = xml.'**'.find{it.personIdExternal.text() == p.personIdExternal.text()} the variable perID is simply the same node as p
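
to get a feel for why those nested scans hurt, here is a rough sketch that models the problem with plain lists instead of xml. the list `ids` and both counters are made up for illustration, so treat the numbers as an indication of the growth rate, not as a measurement of the original script:

```groovy
// hypothetical stand-in for the employee ids; not taken from the question's data
def ids = (1..2000).collect { it.toString() }

// nested scan: for every id, walk the list from the start until it matches
long comparisons = 0
ids.each { id ->
    ids.find { comparisons++; it == id }
}

// indexed lookup: build a map once, then do one hash lookup per id
def index = ids.collectEntries { [it, it] }
long lookups = 0
ids.each { id ->
    if (index[id] != null) lookups++
}

println "nested scan: ${comparisons} comparisons"   // grows like n*(n+1)/2
println "map lookup:  ${lookups} lookups"           // grows like n
```

the nested version does quadratic work, and the original script pays that price twice per employee, on a list that also contains every other node of the document.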

    your code still does not correspond to the xml sample.

    so i'm going to make some assumptions to show how you could build a gpath expression without .'**':

    let's say we have xml like this:

    <?xml version='1.0' encoding='UTF-8'?>
    <multimap:Messages xmlns:multimap="">
            <PARDT>#### 1 ####</PARDT>

    this is the part of the code that builds a large xml message:

    def count = 60000 //just for test let's create xml with 60K elements
    def msg = '''<?xml version='1.0' encoding='UTF-8'?>
    <multimap:Messages xmlns:multimap="">
    '''  </person>
            <PARDT>#### ${it} ####</PARDT>
    '''  </phone>

    and now the modified transforming algorithm:

    def xml = new XmlParser().parseText(msg)
    def t = System.currentTimeMillis()
    def ns = new groovy.xml.Namespace('')
    //for fast search, map each PERNR value to the node that contains it
    def pernrMap=xml[ns.Message2][0].phone[0].children().collectEntries{ [it.PERNR.text(), it] }
    //iterate msg1 -> find entry in pernrMap -> add node
    xml[ns.Message1][0].person.each{ p->
        def emp = p.EmpEmployment[0]
        def pernr = pernrMap['000'+emp.personIdExternal.text()]
        if(pernr) emp.appendNode('erpBenEligibility', null, pernr.PARDT.text() )
    }
    println "t = ${(System.currentTimeMillis()-t)/1000} sec"

    even for 60k elements in msg1 & msg2 it does the transformation in less than 1 sec.
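
for anyone who wants to try the idea end to end, below is a minimal self-contained sketch of the same map-based lookup. it rests on assumptions that are not confirmed by the question: the element names `Messages`, `Message1`, `Message2` and `item` are hypothetical, namespaces are left out to keep it short, and the `date-N` values are placeholders for PARDT.

```groovy
import groovy.xml.XmlParser   // Groovy 3+; on Groovy 2.x drop this line (XmlParser is imported by default)

// build a test document; the layout is an assumption, see the note above
def count = 1000
def sb = new StringBuilder('<Messages><Message1>')
(1..count).each {
    sb << "<person><EmpEmployment><personIdExternal>${it}</personIdExternal></EmpEmployment></person>"
}
sb << '</Message1><Message2>'
(1..count).each {
    sb << "<item><PERNR>000${it}</PERNR><PARDT>date-${it}</PARDT></item>"
}
sb << '</Message2></Messages>'

def xml = new XmlParser().parseText(sb.toString())

// one pass over Message2: PERNR text -> node that contains it
def pernrMap = xml.Message2[0].item.collectEntries { [it.PERNR.text(), it] }

// one pass over Message1: a single map lookup per employee, no tree scan
def appended = 0
xml.Message1[0].person.each { p ->
    def emp = p.EmpEmployment[0]
    def hit = pernrMap['000' + emp.personIdExternal.text()]
    if (hit) {
        emp.appendNode('erpBenEligibility', hit.PARDT.text())
        appended++
    }
}

println "appended ${appended} nodes"
```

the point of the design is that each message is walked exactly once, and explicit paths such as `xml.Message1[0].person` only touch the nodes that are actually needed.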