How does indirect load in Informatica work internally? Does it collate all the data and then process it, or does it process one file at a time? If I have duplicates spanning multiple files, will the duplicate-removal logic in my mapping remove them, or would I have to merge the files using a Union transformation first and then run the duplicate-removal logic?
Informatica reads the listed files as one stream, as if they were a single file. It's like running cat on a filename with a wildcard. For example, if there are two files, f1.txt with testlineA inside and f2.txt with testlineB inside, and you run the command cat f*.txt, you should get:
testlineA
testlineB
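Just as if it were coming from one file. So the duplicate-removal logic in your mapping sees the rows from all the files together, and you don't need a Union transformation to merge them first.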
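To make the analogy concrete, here is a minimal Python sketch of the same idea. It is only an illustration of "several files read as one stream, then deduplicated", not Informatica's actual implementation. The helper names (read_as_one_stream, drop_duplicates, demo) and the file contents are made up for the example, and it uses a wildcard like the cat command above rather than a real indirect file list.

```python
import glob
import os
import tempfile


def read_as_one_stream(pattern):
    """Yield lines from every file matching pattern, in name order, as one stream."""
    for path in sorted(glob.glob(pattern)):
        with open(path) as fh:
            for line in fh:
                yield line.rstrip("\n")


def drop_duplicates(rows):
    """Keep the first occurrence of each row; stands in for the mapping's dedup logic."""
    seen = set()
    for row in rows:
        if row not in seen:
            seen.add(row)
            yield row


def demo():
    # Hypothetical sample data: testlineB appears in both files.
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "f1.txt"), "w") as fh:
            fh.write("testlineA\ntestlineB\n")
        with open(os.path.join(tmp, "f2.txt"), "w") as fh:
            fh.write("testlineB\ntestlineC\n")
        stream = read_as_one_stream(os.path.join(tmp, "f*.txt"))
        return list(drop_duplicates(stream))


result = demo()
print(result)  # ['testlineA', 'testlineB', 'testlineC']
```

Because the dedup step only ever sees one combined stream, the testlineB that appears in both files is kept once, with no separate merge step in between.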