
Pentaho - Can it handle manipulation of XML attributes?


I'm looking at using Pentaho/Kettle for data ingestion. I've already run into a problem, but I'm not sure if it's a problem with the tool or just lack of knowledge on my part.

I've figured out how to create a transformation and read data from XML files, which is the first part of my transformation. Unfortunately, my XML is somewhat like this:

<rootnode>
    <category someattribute="cool" rownum="7">
        <firstnode>some data</firstnode>
        <secondnode>more data</secondnode>
    </category>
    <category someattribute="cooler" rownum="8">
        <firstnode>some data II</firstnode>
        <secondnode>more data II</secondnode>
    </category>
</rootnode>

I was using the Input/Get data from XML step, and while I can get it to show all categories and firstnode/secondnode values properly, I can't find any way to even get a glimpse of the attributes rownum and someattribute.

Is Kettle capable of processing XML attributes and letting you use them in transformation steps? If so, how? Or can someone point me to documentation on the subject? (I can't find any.)


Solution

  • Set the step up to loop on category (for the XML above, a Loop XPath of /rootnode/category), then click Get fields. It will list all the attributes as well as the child nodes.

    A trick if your XML arrives in a field rather than a file: put a sample of the XML into a file, set the XML input step to read from that file, configure the step (so Get fields has something to scan), and then switch back to reading the XML from a field.
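
    To make the idea concrete outside of Kettle, here is a minimal Python sketch of what "looping on category" amounts to: each category element becomes one row, and its attributes become fields alongside the child nodes. This is only an illustration using Python's standard xml.etree.ElementTree module, not Kettle's own implementation; the function name extract_rows is made up for the example.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question.
XML = """<rootnode>
    <category someattribute="cool" rownum="7">
        <firstnode>some data</firstnode>
        <secondnode>more data</secondnode>
    </category>
    <category someattribute="cooler" rownum="8">
        <firstnode>some data II</firstnode>
        <secondnode>more data II</secondnode>
    </category>
</rootnode>"""


def extract_rows(xml_text):
    """Return one dict per <category>, mixing attributes and child-node text.

    Hypothetical helper for illustration; it mirrors the idea of looping
    on /rootnode/category and reading both attributes and children.
    """
    root = ET.fromstring(xml_text)
    rows = []
    # Loop over each <category> directly under the root element.
    for category in root.findall("category"):
        rows.append({
            # Attributes are read from the element itself.
            "someattribute": category.get("someattribute"),
            "rownum": category.get("rownum"),
            # Child nodes are read by tag name.
            "firstnode": category.findtext("firstnode"),
            "secondnode": category.findtext("secondnode"),
        })
    return rows


rows = extract_rows(XML)
for row in rows:
    print(row)
```

    In XPath terms, relative to each category the attributes are addressed as @someattribute and @rownum, while the child nodes are addressed as firstnode and secondnode.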