
How to simply combine flow files in NiFi?


Let's say I have 100 flow files produced by one processor, each containing a different line. I want to get a single new flow file that contains all 100 lines. How can I do that?

I have tried the MergeContent processor, but it gives me the original 100 flow files back.

Current config:

[Screenshot: MergeContent configuration]

Update:

I debugged the output of MergeContent. The first event, JOIN, looks fine: the data is 576.34 KB and contains 100 lines. But after the second event, ATTRIBUTES_MODIFIED, only 1 line seems to reach the final result.

[Screenshot: provenance events for MergeContent]

Update:

This is my whole procedure.

  1. Consume messages from Kafka one at a time.
  2. Convert each Kafka message to a one-line string in its own flow file.
  3. Merge multiple flow files into one.
  4. PutHDFS.

Now I'm stuck at step 3: the flow files are not being merged. I don't care about the order or the attributes; I just need to limit the number of files.

Update:

I have tried setting the correlation attribute to ${kafka.topic}, since all the flow files come from the same Kafka topic, but they still do not merge:

[Screenshot]


Solution

  • Are you using the original or the merged relationship from the MergeContent processor? The former passes the same 100 flowfiles back to you in case you need to do additional processing; the latter gives you a single flowfile with the contents of all the merged flowfiles. From your provenance listing, it looks like the merge event is happening successfully, so double-check which relationship you are using. If possible, please post a screenshot of your flow.
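
To make the difference between the two relationships concrete, here is a rough Python sketch. It is only a toy model, not NiFi code: the `FlowFile` class and the `merge_content` function are hypothetical names made up for illustration, and the newline demarcator is an assumption about how you want the lines joined. It shows that one merge produces both outputs, so what arrives downstream depends on which relationship you connect.

```python
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy stand-in for a NiFi flowfile: content plus attributes."""
    content: str
    attributes: dict = field(default_factory=dict)


def merge_content(flowfiles, demarcator="\n"):
    """Toy model of one merge: return what each relationship receives.

    'original' gets every input flowfile back unchanged;
    'merged' gets one new flowfile holding all the contents.
    """
    merged = FlowFile(
        content=demarcator.join(ff.content for ff in flowfiles),
        # MergeContent records the number of merged flowfiles in merge.count
        attributes={"merge.count": str(len(flowfiles))},
    )
    return {"original": list(flowfiles), "merged": [merged]}


# 100 one-line flowfiles, as in the question
incoming = [FlowFile(f"line {i}") for i in range(100)]
routed = merge_content(incoming)

print(len(routed["original"]))  # the same 100 flowfiles come back here
print(len(routed["merged"]))    # a single combined flowfile goes here
print(len(routed["merged"][0].content.splitlines()))  # lines in the combined flowfile
```

If PutHDFS is connected to original, it will see 100 separate one-line files even though the merge itself succeeded; connecting it to merged (and auto-terminating original) should give the single 100-line file.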