Search code examples
joinselectleft-joinoracle12clistagg

How to concat or Listagg two values in a same column to join it with the values in other column


I have two tables :

TABLE_A

pos_id    res_id     bb_id    bsk_name
1122      10000       1444     type_1 
1122      10000       5678     type_2
1122      10001       1444     type_1
1122      10001       5678     type_2
1122      10002       1467     type_1
1122      10002       5678     type_2
1122      10003       1467     type_1
1122      10003       5678     type_2

table_b

pos_id    row_id    bb_id     bsk_name
1122        1        1444      type_1
1122        1        5678      type_2
1122        2        1467      type_1
1122        2        5678      type_2

I wanted to join the table_a and table_b to get the row_id for each res_id.

the res_id 10000 and 10001 has to have the row_id 1 and res_id 10002 and 10003 has to have the row_id 2. but since there is no unique column to join these two tables i get duplicated values for the bb_id 5678 since they are the same in both row_id's.

So Is there a way like to listagg the bb_id's with the erg_id in table_a and row_id in table_b to join these two tables ?


Solution

  • You can use CAST(COLLECT(...) AS ...) to aggregate the rows into a user-defined collection and then compare the collections.

    First, create a nested-table-type collection:

    CREATE TYPE int_list AS TABLE OF INT;
    

    Then you can use:

    SELECT a.pos_id,
           a.res_id,
           a.bb_id,
           a.bsk_name,
           b.row_id,
           b.bsk_name
    FROM   (
      SELECT a.*,
             CAST(
               COLLECT(bb_id) OVER (PARTITION BY pos_id, res_id)
               AS int_list
             ) AS all_bb_ids
      FROM   table_a a
    ) a
    INNER JOIN (
      SELECT b.*,
             CAST(
               COLLECT(bb_id) OVER (PARTITION BY pos_id, row_id)
               AS int_list
             ) AS all_bb_ids
      FROM   table_b b
    ) b
    ON (    a.pos_id = b.pos_id
        AND a.all_bb_ids = b.all_bb_ids
        AND a.bb_id = b.bb_id)
    

    Which, for your sample data:

    CREATE TABLE TABLE_A ( pos_id, res_id, bb_id, bsk_name ) AS
    SELECT 1122, 10000, 1444, 'type_1' FROM DUAL UNION ALL
    SELECT 1122, 10000, 5678, 'type_2' FROM DUAL UNION ALL
    SELECT 1122, 10001, 1444, 'type_1' FROM DUAL UNION ALL
    SELECT 1122, 10001, 5678, 'type_2' FROM DUAL UNION ALL
    SELECT 1122, 10002, 1467, 'type_1' FROM DUAL UNION ALL
    SELECT 1122, 10002, 5678, 'type_2' FROM DUAL UNION ALL
    SELECT 1122, 10003, 1467, 'type_1' FROM DUAL UNION ALL
    SELECT 1122, 10003, 5678, 'type_2' FROM DUAL;
    
    CREATE TABLE table_b (pos_id, row_id, bb_id, bsk_name) AS
    SELECT 1122, 1, 1444, 'type_1' FROM DUAL UNION ALL
    SELECT 1122, 1, 5678, 'type_2' FROM DUAL UNION ALL
    SELECT 1122, 2, 1467, 'type_1' FROM DUAL UNION ALL
    SELECT 1122, 2, 5678, 'type_2' FROM DUAL;
    

    Outputs:

    POS_ID RES_ID BB_ID BSK_NAME ROW_ID BSK_NAME
    1122 10000 1444 type_1 1 type_1
    1122 10000 5678 type_2 1 type_2
    1122 10001 1444 type_1 1 type_1
    1122 10001 5678 type_2 1 type_2
    1122 10002 1467 type_1 2 type_1
    1122 10002 5678 type_2 2 type_2
    1122 10003 1467 type_1 2 type_1
    1122 10003 5678 type_2 2 type_2

    db<>fiddle here


    Update

    You could also use LISTAGG to aggregate the data; however, if the aggregated string will exceed 4000 bytes then the aggregation will fail (using CAST(COLLECT(...) ...) does not have this limitation but does require you create a collection data type):

    SELECT a.pos_id,
           a.res_id,
           a.bb_id,
           a.bsk_name,
           b.row_id,
           b.bsk_name
    FROM   (
      SELECT a.*,
             LISTAGG(bb_id, ',') WITHIN GROUP (ORDER BY bb_id)
               OVER (PARTITION BY pos_id, res_id) AS all_bb_ids
      FROM   table_a a
    ) a
    INNER JOIN (
      SELECT b.*,
             LISTAGG(bb_id, ',') WITHIN GROUP (ORDER BY bb_id)
               OVER (PARTITION BY pos_id, row_id) AS all_bb_ids
      FROM   table_b b
    ) b
    ON (    a.pos_id = b.pos_id
        AND a.all_bb_ids = b.all_bb_ids
        AND a.bb_id = b.bb_id)
    

    db<>fiddle here