Search code examples
datastageibm-infosphere

running job with different file without reloading the file


I created a job that could be reusable for new files. The entire activities in the job, the maps and everything else will remain the same except for the file name. I already tried it once but it seems that i need to re "load" the file and remap everything again. It's inefficient. Is there any way for me to pass different file in a job without remaping, reconfiguring and reloading anything?


Solution

  • You have multiple options for allowing a DataStage parallel job to use a different filename for input on each job run:

    1. When using either Sequential File stage or File Connector stage, in stead of typing the actual filename, you can input the name of a job parameter which has been defined on the Parameters tab of the job properties dialog. For example, if you define string parameter myFile, then in the filename field of input stage you would enter #myFile# and at job run-time that would be replaced by whatever is the current value of the myFile parameter. If you run job manually from Director/Designer clients, you will have job run dialog where you can specify a value for job parameters. If you start job via dsjob command, there are options to pass in job parameters on command line. You also have option to use parameterset files that you can modify prior to job run.

    2. Another option would be to use a file location and pattern instead of a specific file name. Both Sequential File stage and File Connector stage let you specify a pattern, for example: /data/my_input_files/*.txt Then, each time you run job it will input any files at that location matching the above pattern, so it can process multiple files. However, to prevent re-processing files from prior job runs, you will want to clean up any files at that location after job completes. Then when you have new files to process just put them in that directory and re-run the job.