Search code examples
sqloraclegreatest-n-per-group

Oracle SQL: Get the max record for each duplicate ID in self join table


It's been marked as a duplicate and seems to be explained a bit in the linked questions, but I'm still trying to get the separate DEBIT and CREDIT columns on the same row.

I've created a View and I am currently self joining it. I'm trying to get the max Header_ID for each date.

My SQL is currently:

SELECT DISTINCT
TAB1.id, 
TAB1.glperiods_id, 
MAX(TAB2.HEADER_ID),
TAB1.batch_date, 
TAB1.debit, 
TAB2.credit, 
TAB1.descrip

FROM
IQMS.V_TEST_GLBATCH_GJ TAB1

LEFT OUTER JOIN
IQMS.V_TEST_GLBATCH_GJ TAB2
ON
TAB1.ID = TAB2.ID AND TAB1.BATCH_DATE = TAB2.BATCH_DATE AND TAB1.GLPERIODS_ID = TAB2.GLPERIODS_ID AND TAB1.DESCRIP = TAB2.DESCRIP AND TAB1.DEBIT <> TAB2.CREDIT

WHERE
TAB1.ACCT = '3648-00-0'
AND
TAB1.DESCRIP NOT LIKE '%INV%'
AND TAB1.DEBIT IS NOT NULL

GROUP BY
TAB1.id, 
TAB1.glperiods_id, 
TAB1.batch_date, 
TAB1.debit, 
TAB2.credit, 
TAB1.descrip

ORDER BY TAB1.batch_date

And the output for this is (37 rows in total):

Some rows

I'm joining the table onto itself to get DEBIT and CREDIT on the same line. How do I select only the rows with the max HEADER_ID per BATCH_DATE ?

Update

For @sagi

Those highlighted with the red box are the rows I want and the ones in blue would be the ones I'm filtering out.

enter image description here

Fixed mistake

I recently noticed I had joined my table onto itself without making sure TAB2 ACCT='3648-00-0'.

The corrected SQL is here:

SELECT DISTINCT
TAB1.id, 
TAB1.glperiods_id, 
Tab1.HEADER_ID,
TAB1.batch_date, 
TAB1.debit, 
TAB2.credit, 
TAB1.descrip

FROM
IQMS.V_TEST_GLBATCH_GJ TAB1

LEFT OUTER JOIN
IQMS.V_TEST_GLBATCH_GJ TAB2
ON
TAB1.ID = TAB2.ID AND TAB1.BATCH_DATE = TAB2.BATCH_DATE AND TAB2.ACCT ='3648-00-0'AND TAB1.GLPERIODS_ID = TAB2.GLPERIODS_ID AND TAB1.DESCRIP = TAB2.DESCRIP AND TAB1.DEBIT <> TAB2.CREDIT

WHERE
TAB1.ACCT = '3648-00-0'
AND
TAB1.DESCRIP NOT LIKE '%INV%'
AND TAB1.DEBIT IS NOT NULL

ORDER BY TAB1.BATCH_DATE

Solution

  • Use window function like ROW_NUMBER() :

    SELECT s.* FROM (
        SELECT t.*,
               ROW_NUMBER() OVER(PARTITION BY t.batch_id ORDER BY t.header_id DESC) as rnk
        FROM YourTable t
        WHERE t.ACCT = '3648-00-0'
          AND t.DESCRIP NOT LIKE '%INV%'
          AND t.DEBIT IS NOT NULL) s
    WHERE s.rnk = 1
    

    This is an analytic function that rank your record by the values provided in the OVER clause.

    PARTITION - is the group
    ORDER BY - Who's the first of this group (first gets 1, second 2, ETC)
    

    It is a lot more efficient then joins(Your problem could have been solved in many ways) , and uses the table only once.