
How to sum OVER (PARTITION BY DISTINCT) for Distinct Values


I am looking for a clever way of using SUM() OVER (PARTITION BY ...) in SQL Server.

I have 3 tables in SQL Server (all *_id columns below are just pseudo primary keys):

  • PO (po_id, po_no);
  • PO_ITEM (po_item_id, po_id, po_item_no, qty); // stores ordered quantity for a PO ITEM
  • PO_ITEM_DELY (po_item_dely_id, po_item_id, dely_no, dely_qty); // stores delivered quantity for every item of PO per delivery no.

This query joins the three tables:

select
    po.po_no, pt.po_item_no, pt.qty, pd.dely_no, pd.dely_qty
from 
    PO
inner join 
    PO_ITEM pt on pt.po_id = po.po_id
inner join 
    PO_ITEM_DELY pd on pd.po_item_id = pt.po_item_id
where 
    po.po_no = 'PO1'

Result of this SQL query for reference:

po_no po_item_no qty dely_no dely_qty
PO1 PoI11 300 1 210
PO1 PoI11 300 2 48
PO1 PoI11 300 3 55
PO1 PoI12 100 1 100
PO1 PoI13 250 1 150
PO1 PoI13 250 2 100

So in this example, the Total Ordered Quantity for PO1 is 650, but Total Delivered Qty is 663.
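
To reproduce the result above, the sample data could be set up roughly as follows. This is only a sketch: the table and column names come from the list above, but the column types and the surrogate key values (1, 11, 12, 13 and so on) are assumptions made for illustration.

```sql
-- Assumed column types; only the table and column names come from the question.
CREATE TABLE PO (
    po_id int PRIMARY KEY,
    po_no varchar(20) NOT NULL
);
CREATE TABLE PO_ITEM (
    po_item_id int PRIMARY KEY,
    po_id      int NOT NULL REFERENCES PO (po_id),
    po_item_no varchar(20) NOT NULL,
    qty        int NOT NULL
);
CREATE TABLE PO_ITEM_DELY (
    po_item_dely_id int PRIMARY KEY,
    po_item_id      int NOT NULL REFERENCES PO_ITEM (po_item_id),
    dely_no         int NOT NULL,
    dely_qty        int NOT NULL
);

-- Surrogate key values below are made up; the remaining values match the result shown above.
INSERT INTO PO (po_id, po_no) VALUES (1, 'PO1');

INSERT INTO PO_ITEM (po_item_id, po_id, po_item_no, qty) VALUES
    (11, 1, 'PoI11', 300),
    (12, 1, 'PoI12', 100),
    (13, 1, 'PoI13', 250);

INSERT INTO PO_ITEM_DELY (po_item_dely_id, po_item_id, dely_no, dely_qty) VALUES
    (1, 11, 1, 210),
    (2, 11, 2, 48),
    (3, 11, 3, 55),
    (4, 12, 1, 100),
    (5, 13, 1, 150),
    (6, 13, 2, 100);
```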

Desired result:

po_no OrdPOQty DelyPOQty po_item_no OrdItemQty delyItemQty dely_no dely_qty
PO1 650 663 PoI11 300 313 1 210
PO1 650 663 PoI11 300 313 2 48
PO1 650 663 PoI11 300 313 3 55
PO1 650 663 PoI12 100 100 1 100
PO1 650 663 PoI13 250 250 1 150
PO1 650 663 PoI13 250 250 2 100

I am able to get this result by using a CTE and a subquery:

with poOrdQtyDtl as (
-- Form a Join between PO and PO_ITEM to get Total Ordered Qty Per PO
select
    po.po_id,
    po.po_no,
    sum(pt.qty) OrdPoQty
from PO
inner join PO_ITEM pt on pt.po_id = po.po_id
group by po.po_id, po.po_no
)
select
    poOrdQtyDtl.po_no [PO No.],
    poOrdQtyDtl.OrdPoQty [Ordered Qty For PO],
    sum(itemDely.currDelyQty) over (partition by poOrdQtyDtl.po_no) as [Delivered Qty For Po],
    itemDely.po_item_no [Item No.],
    itemDely.OrdItemQty [Ordered Item Qty],
    itemDely.DelItemQty [Delivered Item Qty],
    itemDely.dely_no [Dely No.],
    itemDely.currDelyQty [Item Qty Delivered in Current Dely]
from poOrdQtyDtl
inner join (
-- Join PO_ITEM and PO_ITEM_DELY to get Item Quantity details
select
    pt.po_id,
    pt.po_item_id,
    pt.po_item_no,
    pt.qty OrdItemQty,
    sum(pd.dely_qty) over (partition by pt.po_item_no) DelItemQty,
    pd.dely_no,
    pd.dely_qty currDelyQty
from PO_ITEM pt
inner join PO_ITEM_DELY pd on pd.po_item_id = pt.po_item_id
) itemDely on itemDely.po_id = poOrdQtyDtl.po_id
WHERE poOrdQtyDtl.po_no = 'PO1'
;

However, I was just wondering if there is an easier way of doing sums by more clever application of over partition by clause. The main challenge is with the query below, as I cannot use DISTINCT in the PARTITION BY clause.

select
    po.po_no,
    -- sum (pt.qty) over (partition by distinct po.po_no, pt.po_item_no) TotPoQOrd, -- INCORRECT
    sum (pt.qty) over (partition by po.po_no, pt.po_item_no) TotPoQOrd,
    sum(pd.dely_qty) over (partition by po.po_no) TotPoQDely,
    pt.po_item_no,
    pt.qty,
    sum(pd.dely_qty) over (partition by po.po_no, pt.po_item_no) TotItemQ,
    pd.dely_no,
    pd.dely_qty
from PO
inner join PO_ITEM pt on pt.po_id = po.po_id
inner join PO_ITEM_DELY pd on pd.po_item_id = pt.po_item_id
where po.po_no = 'PO1'
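
For illustration, the over-counting can be reproduced without the real tables. The sketch below feeds the six joined rows from the result shown earlier in as an inline VALUES list (the alias j and the output column name TotItemQOrd are illustrative only) and applies the same kind of window sum to the ordered quantity:

```sql
SELECT
    j.po_item_no,
    j.dely_no,
    j.qty,
    -- qty is repeated once per delivery row, so it is summed once per delivery
    SUM(j.qty) OVER (PARTITION BY j.po_no, j.po_item_no) AS TotItemQOrd
FROM (VALUES
    ('PO1', 'PoI11', 300, 1, 210),
    ('PO1', 'PoI11', 300, 2, 48),
    ('PO1', 'PoI11', 300, 3, 55),
    ('PO1', 'PoI12', 100, 1, 100),
    ('PO1', 'PoI13', 250, 1, 150),
    ('PO1', 'PoI13', 250, 2, 100)
) AS j (po_no, po_item_no, qty, dely_no, dely_qty)
ORDER BY j.po_item_no, j.dely_no;
```

On this data TotItemQOrd is 900 for PoI11 (3 × 300), 100 for PoI12 and 500 for PoI13 (2 × 250), rather than the ordered quantities 300, 100 and 250.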

Solution

  • Use several different window specifications, computing the ordered-quantity sums before joining to the deliveries:

        select
          x.po_no, 
    
          x.OrdPOQty,
          SUM(pd.dely_qty) OVER(PARTITION BY x.po_no) as DelyPOQty,
          
          x.po_item_no,
    
          x.OrdItemQty,
          SUM(pd.dely_qty) OVER(PARTITION BY x.po_no, x.po_item_no) as DelyItemQty,
           
          x.qty, 
          pd.dely_no, 
          pd.dely_qty
        from 
          ( 
            SELECT 
              po.po_id, po.po_no, pt.po_item_id, pt.po_item_no, pt.qty, 
              SUM(pt.qty) OVER(PARTITION BY po.po_no) as OrdPOQty, 
              SUM(pt.qty) OVER(PARTITION BY po.po_no, pt.po_item_no) as OrdItemQty
            FROM PO inner join PO_ITEM pt on pt.po_id = po.po_id
          ) x
          inner join PO_ITEM_DELY pd on pd.po_item_id = x.po_item_id
        where 
          x.po_no = 'PO1'
    
    

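    Technically, the PARTITION BY x.po_no in the outer query is unnecessary because the WHERE clause ensures there is only one po_no, but I've left it in in case you want to expand the query to cover multiple po_no values. The PARTITION BY inside the derived table x is still needed, because its window functions are evaluated before the outer WHERE filter is applied.

    If you will only ever query one po_no: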

        select
          x.po_no, 
    
          x.OrdPOQty,
          SUM(pd.dely_qty) OVER() as DelyPOQty,
          
          x.po_item_no,
    
          x.OrdItemQty,
          SUM(pd.dely_qty) OVER(PARTITION BY x.po_item_no) as DelyItemQty,
           
          x.qty, 
          pd.dely_no, 
          pd.dely_qty
        from 
          ( 
            SELECT 
              po.po_id, po.po_no, pt.po_item_id, pt.po_item_no, pt.qty, 
              SUM(pt.qty) OVER(PARTITION BY po.po_no) as OrdPOQty, 
              SUM(pt.qty) OVER(PARTITION BY po.po_no, pt.po_item_no) as OrdItemQty
            FROM PO inner join PO_ITEM pt on pt.po_id = po.po_id
          ) x
          inner join PO_ITEM_DELY pd on pd.po_item_id = x.po_item_id
        where 
          x.po_no = 'PO1'
    

    "wondering if there is an easier way of doing sums by more clever application of over partition by clause"

    Well, with the basic form each PO_ITEM row ends up repeated N times, once per delivery. You could count the repeats and divide each value by that count before summing: for an item with three deliveries you would be summing three values that are each a third of the original, which gives the same overall total. But I do feel that makes a bigger mess than doing the ordered-quantity sums at the level where the join has not yet multiplied the rows, and then letting that result be carried out and repeated.
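
The count-and-divide idea described above could be sketched roughly as follows. This is an illustration under assumptions, not a recommendation: the six joined rows are supplied as an inline VALUES list, and the aliases j, c and item_rows are made up for the example. SQL Server does not allow one window function inside another, so the repeat count is computed in a derived table first.

```sql
SELECT
    c.po_no,
    -- each row contributes qty / (number of repeats), so the repeats add back up to qty
    SUM(c.qty * 1.0 / c.item_rows) OVER (PARTITION BY c.po_no) AS OrdPOQty,
    SUM(c.dely_qty) OVER (PARTITION BY c.po_no) AS DelyPOQty,
    c.po_item_no,
    c.qty AS OrdItemQty,
    SUM(c.dely_qty) OVER (PARTITION BY c.po_no, c.po_item_no) AS DelyItemQty,
    c.dely_no,
    c.dely_qty
FROM (
    SELECT
        j.*,
        -- number of delivery rows per item, i.e. how often qty is repeated
        COUNT(*) OVER (PARTITION BY j.po_no, j.po_item_no) AS item_rows
    FROM (VALUES
        ('PO1', 'PoI11', 300, 1, 210),
        ('PO1', 'PoI11', 300, 2, 48),
        ('PO1', 'PoI11', 300, 3, 55),
        ('PO1', 'PoI12', 100, 1, 100),
        ('PO1', 'PoI13', 250, 1, 150),
        ('PO1', 'PoI13', 250, 2, 100)
    ) AS j (po_no, po_item_no, qty, dely_no, dely_qty)
) AS c
ORDER BY c.po_item_no, c.dely_no;
```

On the sample data OrdPOQty comes out as 650 (as a decimal value) and DelyPOQty as 663. Quantities that do not divide evenly by the number of deliveries can pick up small rounding errors, which is one more reason to prefer summing before the join.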

    Or we could count only one row per item, presuming that every item will have at least a delivery #1:

    select
      po.po_no, 
    
      SUM(CASE WHEN pd.dely_no = 1 THEN pt.qty ELSE 0 END) OVER(PARTITION BY po.po_no) as OrdPOQty,
      SUM(pd.dely_qty) OVER(PARTITION BY po.po_no) as DelyPOQty,
      
      pt.po_item_no,
    
      SUM(CASE WHEN pd.dely_no = 1 THEN pt.qty ELSE 0 END) OVER(PARTITION BY po.po_no, pt.po_item_no) as OrdItemQty,
      SUM(pd.dely_qty) OVER(PARTITION BY po.po_no, pt.po_item_no) as DelyItemQty,
       
      pt.qty, 
      pd.dely_no, 
      pd.dely_qty
    from 
      PO
      inner join PO_ITEM pt on pt.po_id = po.po_id
      inner join PO_ITEM_DELY pd on pd.po_item_id = pt.po_item_id
    where 
      po.po_no = 'PO1'
    

    If you add another table that causes pd.dely_no = 1 to appear more than once per po or po+item partition, then you'll need to extend the CASE logic.
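
One possible way to extend the CASE logic, sketched here under assumptions, is to stop relying on dely_no = 1 and instead number the rows of each item with ROW_NUMBER(), counting qty only on the first row. As before, the joined rows are supplied as an inline VALUES list, and the aliases j, r and rn are illustrative; against the real tables the derived table would be the three-table join, and partitioning by po_item_id would be the safer choice.

```sql
SELECT
    r.po_no,
    -- count qty only on the first row of each item, whatever its dely_no is
    SUM(CASE WHEN r.rn = 1 THEN r.qty ELSE 0 END) OVER (PARTITION BY r.po_no) AS OrdPOQty,
    SUM(r.dely_qty) OVER (PARTITION BY r.po_no) AS DelyPOQty,
    r.po_item_no,
    r.qty AS OrdItemQty,
    SUM(r.dely_qty) OVER (PARTITION BY r.po_no, r.po_item_no) AS DelyItemQty,
    r.dely_no,
    r.dely_qty
FROM (
    SELECT
        j.*,
        -- exactly one row per item gets rn = 1
        ROW_NUMBER() OVER (PARTITION BY j.po_no, j.po_item_no ORDER BY j.dely_no) AS rn
    FROM (VALUES
        ('PO1', 'PoI11', 300, 1, 210),
        ('PO1', 'PoI11', 300, 2, 48),
        ('PO1', 'PoI11', 300, 3, 55),
        ('PO1', 'PoI12', 100, 1, 100),
        ('PO1', 'PoI13', 250, 1, 150),
        ('PO1', 'PoI13', 250, 2, 100)
    ) AS j (po_no, po_item_no, qty, dely_no, dely_qty)
) AS r
ORDER BY r.po_item_no, r.dely_no;
```

On the sample data this gives OrdPOQty = 650 and DelyPOQty = 663 on every row, and it keeps working for an item whose first delivery is not numbered 1.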