Search code examples
sqloracledatesumgaps-and-islands

How to create a start and end date with no gaps from one date column and to sum a value within the dates


I am new SQL coding using in SQL developer.

I have a table that has 4 columns: Patient ID (ptid), service date (dt), insurance payment amount (insr_amt), out of pocket payment amount (op_amt). (see table 1 below)

What I would like to do is (1) create two columns "start_dt" and "end_dt" using the "dt" column where if there are no gaps in the date by the patient ID then populate the start and end date with the first and last date by patient ID, however if there is a gap in service date within the patient ID then to create the separate start and end date rows per patient ID, along with (2) summing the two payment amounts by patient ID with in the one set of start and end date visits (see table 2 below).

What would be the way to run this using SQL code in SQL developer? Thank you!

Table 1:

Ptid dt insr_amt op_amt
A 1/1/2021 30 20
A 1/2/2021 30 10
A 1/3/2021 30 10
A 1/4/2021 30 30
B 1/6/2021 10 10
B 1/7/2021 20 10
C 2/1/2021 15 30
C 2/2/2021 15 30
C 2/6/2021 60 30

Table 2:

Ptid start_dt end_dt total_insr_amt total_op_amt
A 1/1/2021 1/4/2021 120 70
B 1/6/2021 1/7/2021 30 20
C 2/1/2021 2/2/2021 30 60
C 2/6/2021 2/6/2021 60 30

Solution

  • You didn't mention the specific database so this solution works in PostgreSQL. You can do:

    select
      ptid,
      min(dt) as start_dt,
      max(dt) as end_dt,
      sum(insr_amt) as total_insr_amt,
      sum(op_amt) as total_op_amt
    from (
      select *,
        sum(inc) over(partition by ptid order by dt) as grp
      from (
        select *,
          case when dt - interval '1 day' = lag(dt) over(partition by ptid order by dt)
               then 0 else 1 end as inc
        from t
      ) x
    ) y
    group by ptid, grp
    order by ptid, grp
    

    Result:

     ptid  start_dt    end_dt      total_insr_amt  total_op_amt 
     ----- ----------  ----------  --------------  ----------- 
     A     2021-01-01  2021-01-04  120             70           
     B     2021-01-06  2021-01-07  30              20           
     C     2021-02-01  2021-02-02  30              60           
     C     2021-02-06  2021-02-06  60              30           
    

    See running example at DB Fiddle 1.

    EDIT for Oracle

    As requested, the modified query that works in Oracle is:

    select
      ptid,
      min(dt) as start_dt,
      max(dt) as end_dt,
      sum(insr_amt) as total_insr_amt,
      sum(op_amt) as total_op_amt
    from (
      select x.*,
        sum(inc) over(partition by ptid order by dt) as grp
      from (
        select t.*,
          case when dt - 1 = lag(dt) over(partition by ptid order by dt)
               then 0 else 1 end as inc
        from t
      ) x
    ) y
    group by ptid, grp
    order by ptid, grp
    

    See running example at db<>fiddle 2.