
How to split a camel message body into rows in order to iterate over them in Talend ESB


So, like the title says, I'm using Talend ESB to handle Camel messaging. In my case, I'm sending the contents of a file as the message body to a child Talend job. In some scenarios the file may contain two or more rows. All I need is to iterate over each of those rows independently within the child job itself.

I guess my question is twofold: 1. If this is possible, how do I do it? 2. Is the iteration better suited to the route level, or to the child job that the route calls?

Right now, the files I'm handling are pipe-delimited (|). To handle this, I have tRouteInput_1 going directly into a tExtractDelimitedFields and use those values to set global variables, like so:

[Image: the beginning of the child ESB job]

The problem is that it only reads the first row of the file and then moves on. I need to iterate over each row within the file/Camel message.

Thanks, Alex


Solution

  • First you need to split your file on the row delimiter using a tNormalize.
    In my example, I simulate your tRouteInput with a tFixedFlowInput containing the whole file as a single line, with rows separated by \n. Then, for each row returned by tNormalize, extract the fields you want with tExtractDelimitedFields (create a schema in it that matches your row structure).

    And the result:

    .--------+--------.
    |    tLogRow_1    |
    |=-------+-------=|
    |field1  |field2  |
    |=-------+-------=|
    |field1.1|field1.2|
    |field2.1|field2.2|
    |field3.1|field3.2|
    '--------+--------'
    

    You need to escape "|" as "\\|" inside tExtractDelimitedFields, because the component treats the field separator as a regex, and the pipe has a special meaning there (alternation).

    As for your 2nd question, I think it's better to do this inside the child job and not the route, since the dedicated components for this are not available in the routing perspective.
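
For reference, the two steps above (tNormalize on the row separator, then tExtractDelimitedFields on the pipe) do roughly what the plain Java below does. This is only a sketch to show the logic and why the pipe has to be escaped: the class name `SplitBodyExample` and the method `splitRows` are made up for illustration, and in the real job the body would come from tRouteInput rather than a hard-coded string.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitBodyExample {

    // Hypothetical helper: splits a message body into rows,
    // then splits each row into fields on the pipe character.
    static List<String[]> splitRows(String body) {
        List<String[]> rows = new ArrayList<>();
        // Step 1 (what tNormalize does here): one item per line.
        // "\\r?\\n" also accepts Windows line endings.
        for (String line : body.split("\\r?\\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            // Step 2 (what tExtractDelimitedFields does here): String.split takes
            // a regex, so the pipe must be escaped as "\\|".
            // The -1 limit keeps trailing empty fields.
            rows.add(line.split("\\|", -1));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed sample body, same shape as the tFixedFlowInput example above.
        String body = "field1.1|field1.2\nfield2.1|field2.2\nfield3.1|field3.2";
        for (String[] fields : splitRows(body)) {
            System.out.println(fields[0] + " / " + fields[1]);
        }
    }
}
```

Without the escaping, `split("|")` treats the pipe as a regex alternation between two empty patterns and splits between every character instead of on the delimiter, which is the same reason the component needs "\\|".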