
Talend XML metadata with multiple similar tags within the loop


I was wondering what would be the best way to create metadata of an XML file with this format:

<?xml version="1.0" encoding="utf-8"?>
<django-objects version="1.0">
    <object pk="8" model="auth.user">
        <field type="CharField" name="username">jd</field>
        <field type="CharField" name="first_name">John</field>
        <field type="CharField" name="last_name">Doe</field>
        <field type="CharField" name="email">[email protected]</field>
    </object>
    <object pk="2102684" model="auth.user">
        <field type="CharField" name="username">kr</field>
        <field type="CharField" name="first_name">Karl</field>
        <field type="CharField" name="last_name">Row</field>
        <field type="CharField" name="email">[email protected]</field>
    </object>
  .... etc
</django-objects>

The problem here is that the <field> tag repeats multiple times inside each <object> (instead of there being separate <username>, <first_name>, etc. tags), which causes the default metadata mapping to return only the first occurrence (username).

How can I best map this kind of data?

Thanks, Koen


Solution

  • OK, this was a newbie question, I guess: I figured it out. I just need to use XPath to map the fields to new columns. For instance:

        <field type="CharField" name="username">jd</field>
    

    maps like this using a tFileInputMSXML component:

        "field[@name='username']". 
    

    Every other field tag is mapped the same way, changing only the name value in the predicate (for example "field[@name='email']"). Works like a charm.
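    To check the XPath expressions outside Talend, the same mapping can be reproduced in a few lines of Python. This is only a rough sketch, not how Talend evaluates the schema internally: it assumes the loop element is object and that each column XPath is evaluated relative to it. SAMPLE_XML, COLUMNS and extract_rows are names made up for this illustration, and the email fields are left out of the sample data.

```python
import xml.etree.ElementTree as ET

# Sample data in the same shape as the question (email fields omitted).
SAMPLE_XML = """
<django-objects version="1.0">
    <object pk="8" model="auth.user">
        <field type="CharField" name="username">jd</field>
        <field type="CharField" name="first_name">John</field>
        <field type="CharField" name="last_name">Doe</field>
    </object>
    <object pk="2102684" model="auth.user">
        <field type="CharField" name="username">kr</field>
        <field type="CharField" name="first_name">Karl</field>
        <field type="CharField" name="last_name">Row</field>
    </object>
</django-objects>
"""

# Hypothetical column-to-XPath mapping, one predicate per repeated <field> tag.
COLUMNS = {
    "username": "field[@name='username']",
    "first_name": "field[@name='first_name']",
    "last_name": "field[@name='last_name']",
}


def extract_rows(xml_text):
    """Return one dict per <object>, with one key per mapped column."""
    root = ET.fromstring(xml_text)
    rows = []
    # Assumption: <object> is the loop element, as in the question's file.
    for obj in root.findall("object"):
        row = {"pk": obj.get("pk")}
        for column, xpath in COLUMNS.items():
            # findtext returns the text of the first match, or None if absent.
            row[column] = obj.findtext(xpath)
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in extract_rows(SAMPLE_XML):
        print(row)
```

    A bare "field" path with no predicate matches the first <field> child only, which is the behaviour described in the question; the [@name='...'] predicate is what selects a specific one.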