Search code examples
talend

How can I count multiple foreign keys and group them in a row?


I have one table with the following structure:

ID|FK1|FK2|FK3|FK4
ID|FK1|FK2|FK3|FK4
ID|FK1|FK2|FK3|FK4

And another table that holds:

FK|DATA
FK|DATA
FK|DATA
FK|DATA

The FKn columns in the first table references the FK field in the second one. There can be more than one record linked between the first table and the second one.

What I want to achieve is to create another table with the total number of records of every FKn linked. For example:

ID|FK1|FK2|FK3|FK4
 A|0  |23 |9  |3
 B|4  |0  |2  |0

I know how to transform the row flow and iterate over every FKn field. I also know how to count. What Im not able to do is to group every FKn count from the same ID into one row, because after I use a tLoop component, every count operation is transformed into a new row like:

FK|count
FK|count
FK|count
FK|count
...

Any idea about how to join rows by packing N of them into one single row each time? Or is there another way to do it?

NOTE: I'm using text data as input


Solution

  • if i understood your issue then i would suggest a different way as below (provided you have fixed number of FK1, FK2, FK3, FK4)

    tFileInput-->Tmap (left join to lookup tAnotherTable(FK DATA) on FK1) -->output-1 i will have four columns - ID, FK1=(0 or 1 - 0 if no matching row is found, 1 if matching row is found), FK2=0, FK3=0, FK4=0. This could be a case where for same ID we can get many FK1 values as you mentioned there can be more than one rows)..

    Similarly i will have tFileInput-->Tmap (left join to lookup tAnotherTable(FK DATA) on FK2) -->output-2 i will have four columns - ID, FK1=0, FK2=(0 or 1 - 0 if no matching row is found, 1 if matching row is found), FK3=0, FK4=0.

    tFileInput-->Tmap (left join to lookup tAnotherTable(FK DATA) on FK3) -->output-3 i will have four columns - ID, FK1=0, FK2=0, FK3=(0 or 1 - 0 if no matching row is found, 1 if matching row is found), FK4=0. ..... ... Next i will Union all these Output-1, output-2, output-3, output-4 into a final result set say union all ResultUnionall and will move them to tagg and group by ID column and take SUM(FK1), SUM(FK2), SUM(FK3)...

    to summarize you job will look something as below
    
    tFileInput--->tmap(withlookup)-->tHash1/tFileOutput1
    tFileInput--->tmap(withlookup)-->tHash1/tFileOutput2
    tFileInput--->tmap(withlookup)-->tHash1/tFileOutput3
    tFileInput--->tmap(withlookup)-->tHash1/tFileOutput4
    
    
    tHashInput1/tFIleInput1---
    tHashInput1/tFIleInput1---
    tHashInput1/tFIleInput1----
    tHashInput1/tFIleInput1---  tUnite--->tAgg-->finaloutput.