Search code examples
sqloracle10ganalytic-functions

performant query using analytic function to select records with 2 date columns


I'm looking for a perfomant way to write a SQL query.

I've got a table with columns (id,fname,lname,accountid,creation_date,update_date) and i have to look for the records in that table that have the same fname,lname,accountid and have the most recent date based on greatest(max(creation_date),max(update_date)) (note that update_date can be null)

I expect that I need to use analytic functions

I have these cases:

(id,fname,lname,accountid,creation_date,update_date)
(1,'a','a','2','07/01/2010 10:59:43','07/01/2010 10:59:43')
(2,'a','a','2','07/01/2010 10:59:43','07/01/2010 10:59:43')
(3,'a','a','2','07/01/2010 10:59:43','07/01/2010 10:59:43')

I want to choose the last inserted : this record (3,'a','a','2','07/01/2010 10:59:43','07/01/2010 10:59:43')

(id,fname,lname,accountid,creation_date,update_date)
(3,'b','a','2','07/01/2009 10:59:43','07/01/2010 10:59:43')
(4,'b','a','2','07/01/2011 10:59:43',null)
(5,'b','a','2','07/01/2009 10:59:43','07/01/2009 10:59:43')

I want to choose the most recent one on both columns (creation_date,update_date) which is (4,'b','a','2','07/01/201110:59:43',null)

(id,fname,lname,accountid,creation_date,update_date)
(6,'c','g','4','07/01/2010 10:59:43',null)
(7,'c','g','4','07/01/2011 10:59:43',null)
(8,'c','g','4','07/01/2009 10:59:43',null)

I want to choose the most recent one on both columns (creation_date,update_date) which is (7,'c','g','4','07/01/2011 10:59:43',null)

(id,fname,lname,accountid,creation_date,update_date)
(9,'k','t','2','07/01/2009 10:59:43','07/01/2012 10:59:43')
(10,'k','t','2','07/01/2011 10:59:43',null)
(11,'k','t','2','07/01/2009 10:59:43','07/01/2009 10:59:43')

I want to choose the most recent one on both columns (creation_date,update_date) which is (9,'k','t','2','07/01/2009 10:59:43','07/01/2012 10:59:43')


Solution

  • You should use the analytic functions rank() or row_number(). My own particular preference is toward rank() but it only really works well if are partitioning by a unique index. Something like the following, assuming there is a unique index on fname, lname, accountid, creation_date

    select *
      from ( select a.*
                  , rank() over ( partition by fname, lname, accountid 
                          order by creation_date desc
                                  , update_date desc ) as rnk
               from my_table a )
     where rnk = 1
    

    This orders each combination of fname, lname, accountid by creation_date, update_date. Using where rnk = 1, enables you to then select the maximum creation_date, update_date.