An employee is hired for different departments. For example, they could be working in Marketing and be paid by Department 0001 and also Department 0002 depending on the hours they do. For this question, hours doesn't matter as I truly only need to find the date ranges when the employee is NOT hired. The below two employees are examples of those.
Employee_ID | Department_ID | Start_Date | End_Date |
---|---|---|---|
395961 | 0001 | 01/01/2022 | 03/31/2022 |
395961 | 0002 | 01/01/2022 | 01/31/2022 |
395961 | 0001 | 06/01/2022 | 10/31/2022 |
395961 | 0001 | 11/01/2022 | 12/31/2022 |
395962 | 0001 | 01/01/2022 | 03/31/2022 |
395962 | 0002 | 01/01/2022 | 01/31/2022 |
395962 | 0002 | 02/01/2022 | 04/30/2022 |
395962 | 0001 | 06/01/2022 | 10/31/2022 |
395962 | 0001 | 11/01/2022 | 12/31/2022 |
395963 | 0001 | 01/01/2022 | 05/31/2022 |
395963 | 0002 | 06/01/2022 | 12/31/2022 |
Employee_ID 395961 and 395962 should return for the query as both have date range gaps. 395961 does not have a date range from 04/01/2022 until 05/31/2022. 395962 does not have a date range from 05/01/2022 until 05/31/2022. 395963 does NOT have a date range gap.
I had in mind to return the months the employee had date ranges and group them by month as we can get duplicate returns. For example, Employee_ID 395961 would return Jan, Feb, March, Jan, Jun, Jul, Aug, Sep, Oct, Nov, Dec and when we group the months, it would remove the duplicate Jan. Then we check which months are missing and if anything returns, we should consider to return this Employee_ID as result.
I truly have NOT figured out how to get a correct result, but I've done some research to find the below that could help. I wouldn't want to use variables as I want this query applied to 1000's of records.
SELECT DATENAME(MONTH, DATEADD(MONTH, x.number, @Start_Date)) AS MonthName
FROM master.dbo.spt_values x
WHERE x.type = 'P'
AND x.number <= DATEDIFF(MONTH, @Start_Date, @End_Date)
Edit: The database is Microsoft SQL Server 2017.
I think this should give you what you are looking for:
WITH cal (date) AS (
SELECT @Start_Date
UNION ALL
SELECT DATEADD(day, 1, date)
FROM cal
WHERE date < @End_Date
)
SELECT e.Employee_ID, COUNT(cal.date) AS Days_Not_Hired
FROM cal
CROSS JOIN (SELECT DISTINCT Employee_ID FROM employees) e
LEFT JOIN employees e2
ON cal.date BETWEEN e2.Start_Date AND e2.End_Date
AND e.Employee_ID = e2.Employee_ID
WHERE e2.Employee_ID IS NULL
GROUP BY e.Employee_ID
OPTION(MAXRECURSION 365);
This is targeted at SQL Server. It may need tweaking for other dialects.
To run for all employees for all time you could do:
WITH cal (date) AS (
SELECT MIN(Start_Date) FROM employees
UNION ALL
SELECT DATEADD(day, 1, date)
FROM cal
WHERE date < CAST( GETDATE() AS DATE )
)
SELECT e.*, COUNT(cal.date) AS Days_Not_Hired
FROM cal
JOIN (
SELECT
Employee_ID,
MIN(Start_Date) AS Min_Start_Date,
MAX(End_Date) AS Max_End_Date
FROM employees
GROUP BY Employee_ID
) e
ON cal.date BETWEEN e.Min_Start_Date AND e.Max_End_Date
LEFT JOIN employees e2
ON cal.date BETWEEN e2.Start_Date AND e2.End_Date
AND e.Employee_ID = e2.Employee_ID
WHERE e2.Employee_ID IS NULL
GROUP BY e.Employee_ID, e.Min_Start_Date, e.Max_End_Date
OPTION(MAXRECURSION 0);
I very rarely use SQL Server so I have no idea how its optimiser is going to deal with this query.