Search code examples
postgresqlsubqueryrollupmultiple-columns

Postgres bulk update on rows using multi-column subqueries


I need to update all the rows in a table(parent table) with summary values (count, min, max) from child tables on 6 fields on the parent table

tab1 (parent table)
-------------------
tab1_ID 
tab1_low
tab1_high
tab2_ref_count
tab2_ref_low
tab2_ref_high
tab3_ref_count
tab3_ref_low
tab3_ref_high
STUS_CD

tab2 (link table for tab1 to tab1)
----------------------------------
tab1_ID
tab1_ref_ID

tab3 (link table for tab1 to tab4)
----------------------------------
tab1_ID
tab4_ref_ID

tab4
-----
tab4_ID
tab4_low
tab4_high

To roll up counts of tab2 and tab3 into tab1, below is the query I am attempting -

UPDATE tab1
SET    (tab2_ref_count, tab2_ref_low, tab2_ref_high)  = 
                     (SELECT COUNT(t1.tab1_ID), MIN(t1.tab1_low),     MAX(t1.tab1_high)
                      FROM   tab2 t2 JOIN tab1 t1 ON (t2.tab1_ref_ID =     t1.tab1_ID) 
                      WHERE  tab1.tab1_ID = t2.tab1_ID),
       (tab3_ref_count, tab3_ref_low, tab3_ref_high)  = 
                     (SELECT COUNT(t4.tab4_ID), MIN(t4.tab4_low),       MAX(t4.tab4_high)
                      FROM   tab3 t3 JOIN tab4 t4 ON (t3.tab4_ref_ID =     t4.tab4_ID)
                      WHERE  tab1.tab1_ID = t3.tab1_ID)
WHERE  STUS_CD IN ('01','02')

But apparently it's not working. Any tips please ?


Solution

  • Could work like this:

    UPDATE tab1 t1
    SET    tab2_ref_count = t2.ct
         , tab2_ref_low   = t2.low
         , tab2_ref_high  = t2.high
         , tab3_ref_count = t3.ct
         , tab3_ref_low   = t3.low
         , tab3_ref_high  = t3.high
    FROM  (
        SELECT t2.tab1_id
             , count(*)          AS ct
             , min(t1.tab1_low)  AS low
             , max(t1.tab1_high) AS high
        FROM  tab2 t2
        JOIN  tab1 t1 ON t1.tab1_id = t2.tab1_ref_id 
        GROUP BY 1
        ) t2
    JOIN  (
        SELECT t3.tab1_id
             , count(*)          AS ct
             , min(t4.tab1_low)  AS low
             , max(t4.tab1_high) AS high
        FROM   tab3 t3
        JOIN   tab4 t4 ON t4.tab4_id = t3.tab4_ref_id 
        GROUP  BY 1
        ) t3 USING (tab1_id)
    WHERE  stus_cd IN ('01','02')
    AND    t1.tab1_id = t2.tab1_id;
    -- AND    t1.tab1_id = COALESCE(t2.tab1_id, t3.tab1_id);  .. for FULL OUTER JOIN
    

    Major points:

    • I use count(*) instead of count(t4.tab1_id). Same result guaranteed here, but simpler, faster.

    • The query assumes that there are rows in tab3 for every tab1_id that is in tab2, too. If that is not the case, you have to change the JOIN type to LEFT JOIN or FULL OUTER JOIN. Use the provided alternative in the final WHERE clause in this case.

    • This query will not update rows in tab1 at all, if not related rows are found in either tab2 or tab3.