apache-nifi

How to merge multiple FlowFiles by attribute value with NiFi


I have a text file with 500 rows of data, which together make up the data of three tables:

Table_LCP            PX37110286387+0002186861MML30508808      23210471S     +0702492378+0702492378+0702492378D+00350     082023-04-262023-04-202023-04-202023-04-26          +000000000000000+000000000000000+00001302EUR+000000000044233+000000000044233+000000000044233+000.0000  2023-05-23                    21             N          C50ED50 C50BB50 2023-05-231S+0000000000+0601141825+0000001000                    +000000000000000+000000000000000               0   0F_AV                 5FX7SI+0702492378                                                                                                                                                                                                                                                                                                                                                                                                                               
Table_LCP            UI37111286592+0002186911PPL30508900      22037464S     +0702373077+0702373077+0702373077D+00350     082023-04-272023-04-212023-04-212023-04-27          +000000000000000+000000000000000+00001302EUR+000000000036075+000000000036075+000000000036075+000.0000  2023-05-16                    11             N          C50ED50 CF82H50 2023-05-231S+0000000000+0601129451+0000001000                    +000000000000000+000000000000000               0   0FATT_AV                 5FX7SI+0702373077                                                                                                                                                                                                                                                                                                                                                                                                                               
Table_CCL            +0002589171BB+0702528071DL      0001-01-01EUR+000000000000000+000000000000000+000000000000000+000000000000000+000000000000000+000000000000000+000000000000000+000000000000000+000000000000000+000000000000000+000000000000000+000000000000000+000000000000000+000000000000000+000000000000000     1+0000000000000000001-01-01+0702528071                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
Table_LDC            PP37143489207+0000+0002045017JHL30633879      914902866  D+00551  2023-05-292023-05-29          +000000000000000+0000000000000003JUI+000000000003800+000000000003167+000000000000000+000.00000                C50ED50 C50ED50 2023-05-232023-05-23+00000S          0   0FUYR_AV   082023-05-2943 3           +005                                 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
Table_LDC            PX37143489208+0000+0002050787RRL30633880      23212728S  D+00350  2023-05-292023-05-29          +000000000010754+0000000000000003XXX+000000000010754+000000000010754+000000000010754+000.00001                C50ED50 C50ED50 2023-05-23          +00000S          0   0FVFR_AV   082023-05-295              +005                                 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      

I split the file into 500 FlowFiles, one per row, and assigned each FlowFile an attribute whose value is the name of the table. Now I want to merge all the FlowFiles so that I end up with 3 FlowFiles, each containing the data of one table. The problem is that I can't get the MergeContent processor to merge them correctly by attribute value.


Thank you for your help.


Solution

  • You don't need to split the original FlowFile for this. Instead, use the PartitionRecord processor with a user-defined property (say, table.name) set to a RecordPath pointing at the first column (if the column is named id, use /id). That will output 3 FlowFiles, one per table, each containing all the rows for that table.
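
To make the grouping concrete, here is a minimal Python sketch of the same idea outside NiFi: rows are partitioned by the value of their first column. It is not NiFi code and not how PartitionRecord is implemented. The function name partition_by_table and the shortened sample rows are assumptions made for illustration, and the sketch assumes the table name is the first whitespace-delimited token on each line.

```python
from collections import defaultdict


def partition_by_table(lines):
    """Group rows by the table name found in the first column.

    Hypothetical helper for illustration only; it assumes the table name
    is the first whitespace-delimited token of each row.
    """
    groups = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        table_name = line.split(maxsplit=1)[0]
        groups[table_name].append(line)
    return dict(groups)


# Shortened versions of the sample rows from the question.
rows = [
    "Table_LCP            PX37110286387+0002186861MML30508808",
    "Table_LCP            UI37111286592+0002186911PPL30508900",
    "Table_CCL            +0002589171BB+0702528071DL",
    "Table_LDC            PP37143489207+0000+0002045017JHL30633879",
    "Table_LDC            PX37143489208+0000+0002050787RRL30633880",
]

partitions = partition_by_table(rows)
for table_name, table_rows in partitions.items():
    print(table_name, len(table_rows))
```

Each key in partitions plays the role of one output FlowFile: all rows that share a table name end up together, so no split-then-merge step is needed.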