Search code examples
sqloracle-databasedatetimesql-order-bygreatest-n-per-group

Remove duplicated record in sql


What I'm trying to do is take records that looks like this:

ID  NAME    STARTDATE   ENDDATE EnrollmentMonth
1   PETER   20190101    20200729    20200701    
1   TONY    20200730    99991231    20200701

and change it to look like this:

ID  NAME    STARTDATE   ENDDATE EnrollmentMonth
1   PETER   20190101    20200729    20200701    

Excluding the name that has less enrollment days, Peter enrolled for 29 days in July while Tony enrolled for 2 days in July.

The idea is to pick only one name in July.

I am having difficult time making this work. Any help would be appreciated.

Thanks


Solution

  • If you want just one record for July, which has the lest number of enrolment days, you can use date arithmetics, order by and fetch first:

    select *
    from mytable
    where enrollmentmonth = date '2020-07-01'
    order by least(enddate, date '2020-07-31') - greatest(startdate, '2020-07-01')
    fetch first row with ties
    

    We can implement the same logic on a per-month basis using window functions:

    select *
    from (
        select t.*, 
            rank() over(
                partition by enrollmentmonth 
                order by least(enddate, last_day(enrollmentmonth)) - greatest(startdate, trunc(enrollmentmonth, 'month'))
            ) rn
        from mytable t
    ) t
    where rn = 1