I am trying to sum the days on the job by employee:
on_job_ind = Y
job_id
to GROUP BY
, because that would make my life
too easyAs you can see in the table below, my SUM()
is summing everthing where on_job_ind = 'Y'
. The last column in the table below, correct_sum
is how the days_on_job need to be totaled up by date groups.
I do not know how to "break" the summation by date groups when encountering a row where on_job_ind = 'N'
.
I have chased my tail using LAG
and LEAD
to no avail.
Is what I'm trying to do possible in SQL?
My SQL thus far:
SELECT emp_nbr
, job_start_date AS act_job_start_dt
, job_end_dt AS act_job_end_dt
, MIN(job_start_date) over (PARTITION BY emp_nbr, on_job_ind ORDER BY job_end_dt ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS min_job_start_dt
, MAX(job_end_dt) over (PARTITION BY emp_nbr, on_job_ind ORDER BY job_end_dt ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS max_job_end_dt
, on_job_ind
, days_on_job
, SUM(days_on_job) over (PARTITION BY emp_nbr, on_job_ind ORDER BY job_end_dt ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS total_days_on_job
FROM emp_job_tbl a
ORDER BY 1, 3 DESC
Table data:
emp_nbr | act_job_start_dt | act_job_end_dt | min_job_start_dt | min_job_end_dt | on_job_ind | days_on_job | total_days_on_job | correct_sum |
---|---|---|---|---|---|---|---|---|
15968 | 2/19/2023 | 3/3/2023 | 3/2/2022 | 3/3/2023 | Y | 12 | 266 | 12 |
15968 | 2/15/2023 | 2/19/2023 | 3/16/2022 | 2/19/2023 | N | 0 | 0 | |
15968 | 2/1/2023 | 2/15/2023 | 3/2/2022 | 2/15/2023 | Y | 14 | 254 | 28 |
15968 | 1/18/2023 | 2/1/2023 | 3/2/2022 | 2/1/2023 | Y | 14 | 240 | |
15968 | 1/7/2023 | 1/18/2023 | 3/16/2022 | 1/18/2023 | N | 0 | 0 | |
15968 | 1/2/2023 | 1/7/2023 | 3/2/2022 | 1/7/2023 | Y | 5 | 226 | 18 |
15968 | 12/30/2022 | 1/2/2023 | 3/2/2022 | 1/2/2023 | Y | 3 | 221 | |
15968 | 12/20/2022 | 12/30/2022 | 3/2/2022 | 12/30/2022 | Y | 10 | 218 | |
15968 | 12/19/2022 | 12/20/2022 | 3/16/2022 | 12/20/2022 | N | 0 | 0 | |
15968 | 12/16/2022 | 12/19/2022 | 3/2/2022 | 12/19/2022 | Y | 3 | 208 | 30 |
15968 | 12/1/2022 | 12/16/2022 | 3/2/2022 | 12/16/2022 | Y | 15 | 205 | |
15968 | 11/19/2022 | 12/1/2022 | 3/2/2022 | 12/1/2022 | Y | 12 | 190 | |
15968 | 11/1/2022 | 11/19/2022 | 3/16/2022 | 11/19/2022 | N | 0 | 0 | |
15968 | 10/24/2022 | 11/1/2022 | 3/16/2022 | 11/1/2022 | N | 0 | 0 | |
15968 | 10/7/2022 | 10/24/2022 | 3/16/2022 | 10/24/2022 | N | 0 | 0 | |
15968 | 10/2/2022 | 10/7/2022 | 3/2/2022 | 10/7/2022 | Y | 5 | 178 | 19 |
15968 | 9/18/2022 | 10/2/2022 | 3/2/2022 | 10/2/2022 | Y | 14 | 173 | |
15968 | 9/14/2022 | 9/18/2022 | 3/16/2022 | 9/18/2022 | N | 0 | 0 | |
15968 | 9/1/2022 | 9/14/2022 | 3/2/2022 | 9/14/2022 | Y | 13 | 159 | 57 |
15968 | 8/18/2022 | 9/1/2022 | 3/2/2022 | 9/1/2022 | Y | 14 | 146 | |
15968 | 8/1/2022 | 8/18/2022 | 3/2/2022 | 8/18/2022 | Y | 17 | 132 | |
15968 | 7/19/2022 | 8/1/2022 | 3/2/2022 | 8/1/2022 | Y | 13 | 115 | |
15968 | 7/1/2022 | 7/19/2022 | 3/16/2022 | 7/19/2022 | N | 0 | 0 | |
15968 | 6/30/2022 | 7/1/2022 | 3/16/2022 | 7/1/2022 | N | 0 | 0 | |
15968 | 6/20/2022 | 6/30/2022 | 3/2/2022 | 6/30/2022 | Y | 10 | 102 | 10 |
15968 | 6/11/2022 | 6/20/2022 | 3/16/2022 | 6/20/2022 | N | 0 | 0 | |
15968 | 6/9/2022 | 6/11/2022 | 3/16/2022 | 6/11/2022 | N | 0 | 0 | |
15968 | 6/1/2022 | 6/9/2022 | 3/2/2022 | 6/9/2022 | Y | 8 | 92 | 53 |
15968 | 5/30/2022 | 6/1/2022 | 3/2/2022 | 6/1/2022 | Y | 2 | 84 | |
15968 | 5/19/2022 | 5/30/2022 | 3/2/2022 | 5/30/2022 | Y | 11 | 82 | |
15968 | 5/2/2022 | 5/19/2022 | 3/2/2022 | 5/19/2022 | Y | 17 | 71 | |
15968 | 4/17/2022 | 5/2/2022 | 3/2/2022 | 5/2/2022 | Y | 15 | 54 | |
15968 | 4/14/2022 | 4/17/2022 | 3/16/2022 | 4/17/2022 | N | 0 | 0 | |
15968 | 3/30/2022 | 4/14/2022 | 3/2/2022 | 4/14/2022 | Y | 15 | 39 | 25 |
15968 | 3/20/2022 | 3/30/2022 | 3/2/2022 | 3/30/2022 | Y | 10 | 24 | |
15968 | 3/16/2022 | 3/20/2022 | 3/16/2022 | 3/20/2022 | N | 0 | 0 | |
15968 | 3/2/2022 | 3/16/2022 | 3/2/2022 | 3/16/2022 | Y | 14 | 14 | 14 |
The trick to this one is that you actually have to consolidate 2 of the rows based on the fact that one is ending at the same time another one is beginning.
What I do is start by flagging a row for whether or not is it a "start" of a group or not. In this case, I used a window function. (Line 14)
If for some reason you can't get this to work with Teradata, please just message me because I know I've solved this on Teradata before. I no longer have access to Teradata so I tried my best. But I have solved this exact problem on Teradata many times so I know it's possible.
Here is the code that gives the result you're looking for, using Snowflake syntax. Hopefully only minor adjustments are needed.
For more information about solving this problem, the keywords to google would be "Consolidating overlapping time ranges" or the "Gaps and Islands Problem."
WITH VALID_DATA AS (
SELECT
*
FROM
RASGO.STACKOVERFLOW_Q75700970
WHERE
ON_JOB_IND = 'Y'
),
GROUP_STARTS AS (
SELECT
EMP_NBR,
ACT_JOB_START_DT,
ACT_JOB_END_DT,
CASE WHEN ACT_JOB_START_DT > max(ACT_JOB_END_DT) OVER (
PARTITION BY EMP_NBR
ORDER BY
ACT_JOB_START_DT,
ACT_JOB_END_DT ROWS BETWEEN UNBOUNDED preceding
AND 1 preceding
) THEN 1 ELSE 0 END GROUP_START
FROM
VALID_DATA
),
GROUPS AS (
SELECT
EMP_NBR,
ACT_JOB_START_DT,
ACT_JOB_END_DT,
sum(GROUP_START) OVER (
PARTITION BY EMP_NBR
ORDER BY
ACT_JOB_START_DT,
ACT_JOB_END_DT
) grp
FROM
GROUP_STARTS
),
CONSOLIDATED AS (
SELECT
EMP_NBR,
min(ACT_JOB_START_DT) GROUP_START_DT,
max(ACT_JOB_END_DT) GROUP_END_DT
FROM
GROUPS
GROUP BY
EMP_NBR,
grp
)
SELECT
*,
DATEDIFF(
'DAY', GROUP_START_DT, GROUP_END_DT
) AS total_days_on_job
FROM
CONSOLIDATED
ORDER BY
2
Result:
+---------+----------------+--------------+-------------------+
| EMP_NBR | GROUP_START_DT | GROUP_END_DT | TOTAL_DAYS_ON_JOB |
+---------+----------------+--------------+-------------------+
| 15968 | 3/2/22 | 3/16/22 | 14 |
| 15968 | 3/20/22 | 4/14/22 | 25 |
| 15968 | 4/17/22 | 6/9/22 | 53 |
| 15968 | 6/20/22 | 6/30/22 | 10 |
| 15968 | 7/19/22 | 9/14/22 | 57 |
| 15968 | 9/18/22 | 10/7/22 | 19 |
| 15968 | 11/19/22 | 12/19/22 | 30 |
| 15968 | 12/20/22 | 1/7/23 | 18 |
| 15968 | 1/18/23 | 2/15/23 | 28 |
| 15968 | 2/19/23 | 3/3/23 | 12 |
+---------+----------------+--------------+-------------------+