Tags: pentaho, kettle

Dummy step is not working in a Job


Each transformation creates a csv file in a folder, and I want to upload all of them once the transformations are done. I added a Dummy step, but the process didn't work as I expected: each transformation triggers the Hadoop Copy Files step. Why? And how should I design the flow? Thanks.

[job layout screenshot]


Solution

  • First of all, if possible, launch the .ktr files in parallel (right-click on the START entry > click on Launch Next Entries in Parallel). This ensures that all the .ktr files are launched in parallel.

  • Secondly, instead of the Dummy step, you can use any of the following job entries, depending on what suits you:

    1. "Checks if files exist" Step: Before moving to the Hadoop step, you can do a small check if all the files has been properly created and then proceed with your execution.
    2. "Wait For" Step: You can give some time to wait for all the step to complete before moving to the next entry. I don't suggest this since the time of writing a csv file might vary, unless you are totally sure of some time.
    3. "Evaluate files metrics" : Check the count of the files before moving forward. In your case check if the file count is 9 or not.

    The idea is simply to do some sort of check on the files before you copy the data to HDFS.
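    For illustration only, here is a minimal sketch of the kind of check the "Checks if files exist" / "Evaluate files metrics" entries perform, written as a standalone Python script rather than a PDI entry. The folder path, expected file count, and timeout below are hypothetical placeholders; in PDI you would normally configure the built-in job entries instead, but the logic is the same: wait until the expected number of csv files exists, and only then allow the copy to HDFS.

        import glob
        import sys
        import time

        # Hypothetical values: adjust to your own output folder and transformation count.
        OUTPUT_DIR = "/tmp/etl_output"   # folder the transformations write their csv files to
        EXPECTED_COUNT = 9               # one csv file per transformation
        TIMEOUT_SECONDS = 300            # give up after 5 minutes
        POLL_SECONDS = 10                # how often to re-check the folder

        def all_files_present() -> bool:
            """Return True once the expected number of csv files exists in OUTPUT_DIR."""
            return len(glob.glob(f"{OUTPUT_DIR}/*.csv")) >= EXPECTED_COUNT

        deadline = time.time() + TIMEOUT_SECONDS
        while time.time() < deadline:
            if all_files_present():
                print("All csv files present; safe to copy to HDFS.")
                sys.exit(0)   # success -> the job can follow the success hop
            time.sleep(POLL_SECONDS)

        print("Timed out waiting for csv files.", file=sys.stderr)
        sys.exit(1)           # failure -> the job can follow the failure hop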

    Hope it helps :)