Search code examples
sqloracleanalytics

Select start/end date for each group


I have an employee job assignment table in the form shown below:

emp_id, dept_id, assignment,  start_dt,    end_dt
1,      10,      project 1,   2001-01-01,  2001-12-31
1,      10,      project 2,   2002-01-01,  2002-12-31
1,      20,      project 3,   2003-01-01,  2003-12-31
1,      20,      project 4,   2004-01-01,  2004-12-31
1,      10,      project 5,   2005-01-01,  2005-12-31

From the above table i need to summarize employee department history i.e the duration an employee worked for a particular department before being transferred to some other department.

Expected output result is shown below:

emp_id, dept_id,  start_dt,    end_dt
1,      10,       2001-01-01,  2002-12-31
1,      20,       2003-01-01,  2004-12-31
1,      10,       2005-01-01,  2005-12-31

I have tried to solve the above problem using the oracle analytics functions but was unable to get the desired output

    select distinct emp_id, dept_id, start_dt, end_dt 
    from ( 
       select emp_id, dept_id, 
              min(start_date) 
                 over (partition by emp_id, dept_id order by emp_id, dept_id 
                 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as start_dt,
              max(end_date)   
                 over (partition by emp_id, dept_id order by emp_id, dept_id 
                 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as end_dt
       from employee_job_assignment
    )
    where emp_id = 1;

The above query results in the following output:

emp_id, dept_id,  start_dt,    end_dt
1,      10,       2001-01-01,  2005-12-31
1,      20,       2003-01-01,  2004-12-31

Solution

  • The key to the solution is to separate the rows into groups according to your logic. You can do that using the LAG() function. For example:

    select
      max(emp_id) as emp_id,
      max(dept_id) as dept_id,
      min(start_dt) as start_dt,
      max(end_dt) as end_dt
    from (
      select
        *,
        sum(inc) over(partition by emp_id order by start_dt) as grp
      from (
        select
          *,
          case when lag(dept_id) over(partition by emp_id order by start_dt) 
                    <> dept_id then 1 else 0 end as inc
        from employee_job_assignment
      ) x
    ) y
    group by grp
    order by grp