Search code examples
postgresqlpgadmin

hep with CTE in Postgresql - join tables


i have following query.

with sh as
(select payor, name_policy, program, count(distinct event_id) as sevents from cart_item_funder_policy_worker where event_Status like 'Show%' 
    and event_time between '2022-04-01' and '2022-09-30' group by payor, name_policy, program),
qty_s as
(select payor, name_policy, program, sum(item_qty) as qty_show from cart_item_funder_policy_worker where event_Status like 'Show%' 
    and event_time between '2022-04-01' and '2022-09-30' group by payor, name_policy, program),
ns as
(select payor, name_policy, program, count(distinct event_id) as nsevents from cart_item_funder_policy_worker where event_Status like 'No Show%' 
    and event_time between '2022-04-01' and '2022-09-30' group by payor, name_policy, program),
qty_ns as
(select payor, name_policy, program, sum(item_qty) as qty_no_show from cart_item_funder_policy_worker where event_Status like 'No Show%' 
    and event_time between '2022-04-01' and '2022-09-30' group by payor, name_policy, program)
    
select sh.payor, sh.name_policy, sh.program, sh.sevents, qty_s.qty_show, ns.nsevents, qty_ns.qty_no_show
from sh
full join ns on sh.name_policy = ns.name_policy
full join qty_s on qty_s.name_policy = sh.name_policy
full join qty_ns on qty_ns.name_policy = ns.name_policy;

as you can see that i only want to see the (count of event_id and sum of item_qty) grouped by payor, name_policy, program. but the above query groups the result by sh.sevents, qty_s.qty_show, ns.nsevents, qty_ns.qty_no_show as well.

can you please provide any assistance in figuring out how i can avoid grouping of sh.sevents, qty_s.qty_show, ns.nsevents, qty_ns.qty_no_show in my query.

Table definition:

CREATE TABLE cart_item_funder_policy_worker
( payor          VARCHAR
, name_policy    VARCHAR
, program        VARCHAR
, item_qty       INTEGER
, event_id       INTEGER
, event_date     DATE
, event_status   VARCHAR
);

thanks


Solution

  • It looks like you're nearly there and you can simplify your CTEs to perform both the COUNT and SUM aggregates over the same data.

    There doesn't appear to be a need for a FULL JOIN here. When will you have data in sh or ns that doesn't exist in cart_item_funder_policy_worker? LEFT JOIN will produce a row in the result set where you have data in cart_item_funder_policy_worker but not in sh or ns. Or just INNER JOIN/JOIN if you don't care about joining against NULL.

    I didn't test it but I believe you can get the aggregates down to one block with something like this:

    WITH data AS
    (
        SELECT payor
             , name_policy
             , program
             , item_qty
             , event_id
             , event_status
          FROM cart_item_funder_policy_worker
         WHERE event_time BETWEEN '2022-04-01' AND '2022-09-30'
    )
    , aggs AS
    (
        SELECT d.name_policy
             , COUNT(DISTINCT d.event_id) FILTER (WHERE d.event_status LIKE 'Show%') AS sevents
             , COUNT(DISTINCT d.event_id) FILTER (WHERE d.event_status LIKE 'No Show%') AS nsevents
             , SUM(d.item_qty) FILTER (WHERE d.event_status LIKE 'Show%') AS qty_show
             , SUM(d.item_qty) FILTER (WHERE d.event_status LIKE 'No Show%') AS qty_no_show
          FROM data d
         GROUP BY d.payor, d.name_policy, d.program
    )
    SELECT DISTINCT 
           d.payor
         , d.name_policy
         , d.program
         , a.sevents
         , a.qty_show
         , a.nsevents
         , a.qty_no_show
      FROM data d
      LEFT JOIN aggs a ON d.name_policy = a.name_policy
    ;
    

    Or something like this:

    SELECT all of the data you're going to be aggregating over in the first CTE, then calculate your aggregates in a separate CTE. In your aggregate portions, SELECT only your JOIN fields and the aggregates. Finally, grab your desired result set.

    WITH data AS
    (
        SELECT payor
             , name_policy
             , program
             , item_qty
             , event_id
             , event_status
          FROM cart_item_funder_policy_worker
         WHERE event_time BETWEEN '2022-04-01' AND '2022-09-30'
    )
    , sh AS
    (
        SELECT d.name_policy
             , COUNT(DISTINCT d.event_id) AS sevents
             , SUM(d.item_qty) AS qty_show
          FROM data d
         WHERE d.event_status LIKE 'Show%'
         GROUP BY d.payor, d.name_policy, d.program
    )
    , ns AS
    (
        SELECT d.name_policy
             , COUNT(DISTINCT d.event_id) AS nsevents
             , SUM(d.item_qty) AS qty_no_show
          FROM data d
         WHERE d.event_status LIKE 'No Show%'
         GROUP BY d.payor, d.name_policy, d.program
    )
    SELECT DISTINCT 
           d.payor
         , d.name_policy
         , d.program
         , sh.sevents
         , sh.qty_show
         , ns.nsevents
         , ns.qty_no_show
      FROM data d
      LEFT JOIN sh ON d.name_policy = sh.name_policy
      LEFT JOIN ns ON d.name_policy = ns.name_policy
    ;