How do I optimize this query to get the same result, without taking as long? The NOT IN
subquery takes a very long time.
SELECT DISTINCT EmployeeId FROM employees
WHERE
status = 'Active'
&& BranchId = '2'
&& NOT EXISTS (
SELECT * FROM attendance
WHERE
employees.EmployeeId = attendance.EmployeeId
&& attendance.AttendanceDate = '2015-01-20'
)
)
SELECT EmployeeId FROM employees
WHERE
status = 'Active'
&& BranchId = '2'
&& NOT IN (
SELECT EmployeeId FROM attendance WHERE AttendanceDate='2015-01-20'
)
Here is another version of your query
select
distinct e.EmployeeId FROM employees e
left join attendance a on e.EmployeeId = a.EmployeeId and a.AttendanceDate = '2015-01-20'
where
e.status='Active'
and e.BranchId= '2'
and a.EmployeeId is null
You will also need some indexes to be applied on the tables as
alter table employees add index st_br_idx(status,BranchId);
alter table AttendanceDate add index AttendanceDate_idx(AttendanceDate);
if EmployeeId is a foreign key then no need to add the index else if the index is not already there you may need the following as well
alter table AttendanceDate add index EmployeeId_idx(EmployeeId);
If EmployeeId
is a primary key in employees
then its already indexed if not and not indexed you may need to add index for that as well
alter table employees add index EmployeeId_idx(EmployeeId);
You may also check your original query after having above indexes
SELECT DISTINCT e.EmployeeId FROM employees e
WHERE
e.status='Active'
and e.BranchId= '2'
and NOT EXISTS (
SELECT 1 FROM
attendance a WHERE e.EmployeeId = a.EmployeeId
and a.AttendanceDate='2015-01-20'
)
To analyze the query use explain select..
and see how optimizer is using the indexes
and possible number of rows the optimizer may scan for retrieving the records