pentaho etl kettle

Generate a separate XML file for each MS Excel row with Pentaho Kettle


I have an Excel file with a list of items: column A holds the ID, column B the name.

For example (2 lines):

Line 1: A - 12000; B - "Name of the first item"
Line 2: A - 12001; B - "Name of the second item"

I need to go through all the lines and create a file named ID.xml for each one. For the example above, I want to end up with 2 files in the output folder:

12000.xml

<?xml version="1.0" encoding="utf-8"?>
<item>
  <property key="ID" value="12000"/>
  <property key="name" value="Name of the first item"/>
</item>

12001.xml

<?xml version="1.0" encoding="utf-8"?>
<item>
  <property key="ID" value="12001"/>
  <property key="name" value="Name of the second item"/>
</item>

How can I achieve this with the Pentaho Kettle ETL tool?

Any help is appreciated.


Solution

  • If the XML structure is as simple as the one shown here, the simplest way is to build the XML string in a Modified Java Script Value step, generate the file name in the same step, and then use the Text file output step with the "Accept file name from field?" box checked and the file name field selected.

    This will output each row of data in a separate file.

    If your structure is more complex than that, you'll probably need to combine several Add XML steps with XML Join steps.

    There is an XML Join sample in PDI's samples folder.
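
For the simple case, the script part could look roughly like the sketch below. It is only an illustration, not a tested transformation: the helper names (escapeXmlAttr, buildItemXml, buildFilename) and the output folder /tmp/out are made up for this example, and it assumes the Excel input step delivers the two columns as fields called ID and Name. The escaping helper is there because a name containing &, <, > or a double quote would otherwise produce invalid XML.

```javascript
// Escape characters that are not allowed inside a double-quoted XML attribute.
// The ampersand must be replaced first so later entities are not double-escaped.
function escapeXmlAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the complete document for one row as a single string.
function buildItemXml(id, name) {
  return '<?xml version="1.0" encoding="utf-8"?>\n' +
    '<item>\n' +
    '  <property key="ID" value="' + escapeXmlAttr(id) + '"/>\n' +
    '  <property key="name" value="' + escapeXmlAttr(name) + '"/>\n' +
    '</item>';
}

// Build the target path for one row (outputDir is a hypothetical folder).
function buildFilename(outputDir, id) {
  return outputDir + "/" + id + ".xml";
}

// Inside the Kettle script step the incoming fields would be used like this
// (ID and Name are assumed field names; xml and filename become new fields):
// var xml = buildItemXml(ID, Name);
// var filename = buildFilename("/tmp/out", ID);
```

The two new fields then go to the Text file output step: filename as the file name field and xml as the only output field. You will most likely also want to switch off the header row and clear the separator and enclosure so that nothing but the XML string ends up in the file, and leave the extension empty if the file name field already ends in .xml.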