I am currently encountering challenges in comprehending the implementation of recursive search within Oracle SQL.
My encounter with a HackerRank question necessitates the application of this concept. Although I refrain from disclosing the specific HackerRank inquiry, I intend to leverage it as a reference to devise the logic independently. However, I seek assistance in dissecting the problem and understanding its intricacies.
Should anyone possess alternative methodologies to achieve the desired outcome, I am open to exploring diverse solutions. Your contributions and insights are highly valued, and I eagerly anticipate your responses.
Below is my sceanrio.
I have a table called : employee
It as 3
columns -> emp_id , emp_date , emp_task
Data is like below
EMP_ID | EMP_DATE | EMP_TASK
A1 | 01-04-2024 | 345
A2 | 01-04-2024 | 546
A3 | 01-04-2024 | 232
A4 | 01-04-2024 | 8000
A5 | 01-04-2024 | 2344
A1 | 02-04-2024 | 456
A2 | 02-04-2024 | 9280
A3 | 02-04-2024 | 324
A2 | 02-04-2024 | 754
A8 | 02-04-2024 | 75
A2 | 03-04-2024 | 400
A3 | 03-04-2024 | 234
A3 | 04-04-2024 | 100
And the condition is to calculate the total number of emp_id
participated.
It is clearly mentioned that : get count of those emp_id who participated in each day.
Make sure they were their from initil earlier date.
Here is the logic I need to implement recursively but not known how to start writing query ?.
If you see Day 1 : 01-04-2024
-> emp_id participated -> A1 , A2 , A3 , A4 , A5
If you see Day 2 : 02-04-2024 -> emp_id participated ->
A1 , A2 , A3`. They are the one who participated in Day 1 as well so no of emp_id participated is 3
Similarly it continues till end.
Note : First day who participated need to be consider till end. They are participating.
A8
as joined in middle as this emp_id
is not part of first day so in count it is not considered.
So my expected output is below need to be achieved
emp_date cout_participated
01-04-2024 5
02-04-2024 3
03-04-2024 2
04-04-2024 1
My trials not implemented fully
WITH get_initial_day_emp_participated AS (
-- Fetching all emp_id who initially participated
SELECT
*
FROM
employee
WHERE
emp_date = (
SELECT
MIN(emp_date)
FROM
employee
)
), get_count_initial_day_participated AS (
-- Using group getting the count of emp_id who initially participated
SELECT
emp_date,
COUNT(DISTINCT emp_id)
FROM
get_initial_day_emp_participated
GROUP BY
emp_date
), get_details_except_first_date AS (
-- Fetching all details except first day participated
SELECT
*
FROM
employee
WHERE
emp_date != (
SELECT
MIN(emp_date)
FROM
employee
)
)
SELECT
*
FROM
get_details_except_first_date;
For you reference giving create table and insert details
create table employee
(
EMP_ID varchar2(10),
EMP_DATE date,
EMP_TASK number
);
Insert into employee (EMP_ID,EMP_DATE,EMP_TASK) values ('A1',to_date('01-04-2024','DD-MM-YYYY'),345);
Insert into employee (EMP_ID,EMP_DATE,EMP_TASK) values ('A2',to_date('01-04-2024','DD-MM-YYYY'),546);
Insert into employee (EMP_ID,EMP_DATE,EMP_TASK) values ('A3',to_date('01-04-2024','DD-MM-YYYY'),232);
Insert into employee (EMP_ID,EMP_DATE,EMP_TASK) values ('A4',to_date('01-04-2024','DD-MM-YYYY'),8000);
Insert into employee (EMP_ID,EMP_DATE,EMP_TASK) values ('A5',to_date('01-04-2024','DD-MM-YYYY'),2344);
Insert into employee (EMP_ID,EMP_DATE,EMP_TASK) values ('A1',to_date('02-04-2024','DD-MM-YYYY'),456);
Insert into employee (EMP_ID,EMP_DATE,EMP_TASK) values ('A2',to_date('02-04-2024','DD-MM-YYYY'),9280);
Insert into employee (EMP_ID,EMP_DATE,EMP_TASK) values ('A3',to_date('02-04-2024','DD-MM-YYYY'),324);
Insert into employee (EMP_ID,EMP_DATE,EMP_TASK) values ('A2',to_date('02-04-2024','DD-MM-YYYY'),754);
Insert into employee (EMP_ID,EMP_DATE,EMP_TASK) values ('A8',to_date('02-04-2024','DD-MM-YYYY'),75);
Insert into employee (EMP_ID,EMP_DATE,EMP_TASK) values ('A2',to_date('03-04-2024','DD-MM-YYYY'),400);
Insert into employee (EMP_ID,EMP_DATE,EMP_TASK) values ('A3',to_date('03-04-2024','DD-MM-YYYY'),234);
Insert into employee (EMP_ID,EMP_DATE,EMP_TASK) values ('A3',to_date('04-04-2024','DD-MM-YYYY'),100);
This isn't recursive, but an alternative method is to count the number of days that have elapsed, and the number of days that an emp_id has been active:
select emp_id, emp_date,
emp_date - min(emp_date) over() + 1 as day_num,
count(*) over (partition by emp_id order by emp_date) emp_day_num
from (
select distinct emp_id, emp_date
from employee
)
order by emp_date, emp_id
EMP_ID | EMP_DATE | DAY_NUM | EMP_DAY_NUM |
---|---|---|---|
A1 | 01-APR-24 | 1 | 1 |
A2 | 01-APR-24 | 1 | 1 |
A3 | 01-APR-24 | 1 | 1 |
A4 | 01-APR-24 | 1 | 1 |
A5 | 01-APR-24 | 1 | 1 |
A1 | 02-APR-24 | 2 | 2 |
A2 | 02-APR-24 | 2 | 2 |
A3 | 02-APR-24 | 2 | 2 |
A8 | 02-APR-24 | 2 | 1 |
A2 | 03-APR-24 | 3 | 3 |
A3 | 03-APR-24 | 3 | 3 |
A3 | 04-APR-24 | 4 | 4 |
The subquery is because count(distinct emp_id) over ...
isn't allowed.
The min(emp_date) over()
is the analytic version on min()
, and gives the earliest date across the whole data set, but available as a value to every row. Then emp_date - min(emp_date) over()
gives the offset of each row's date from that minimum; adding 1 just makes it 1-4 instead of 0-3 to make it easier to compare.
count(*) over (partition by emp_id order by emp_date)
is an analytic count that shows how many dates (distinct, because of the subquery) each emp_id
has appeared so far, up to the current row's date. If an ID starts late or skips a date that count will no longer align with the day number.
Then see if they match, and count those that do:
select emp_date, count(emp_id)
from (
select emp_id, emp_date,
emp_date - min(emp_date) over() + 1 as day_num,
count(*) over (partition by emp_id order by emp_date) emp_day_num
from (
select distinct emp_id, emp_date
from employee
)
)
where emp_day_num = day_num
group by emp_date
order by emp_date
EMP_DATE | COUNT(EMP_ID) |
---|---|
01-APR-24 | 5 |
02-APR-24 | 3 |
03-APR-24 | 2 |
04-APR-24 | 1 |
If you want a recursive approach then you could use a recursive CTE:
with rcte (emp_id, emp_date) as (
select emp_id, emp_date
from employee
where emp_date = (
select min(emp_date) from employee
)
union all
select r.emp_id, e.emp_date
from rcte r
join employee e
on e.emp_id = r.emp_id
and e.emp_date = r.emp_date + 1
)
select emp_date, count(distinct emp_id)
from rcte
group by emp_date
order by emp_date
EMP_DATE | COUNT(DISTINCTEMP_ID) |
---|---|
01-APR-24 | 5 |
02-APR-24 | 3 |
03-APR-24 | 2 |
04-APR-24 | 1 |
The anchor branch is similar to your first CTE, the recursive branch looks for a matching ID the next day. For each row in the previous rcte
result, it looks in employee
for a row with same ID and the next day (which is r.emp_date + 1
). Once there is a gap for an ID it stop appearing.