Here is my RDB structure.
I try to count the number of departments and employees related to a single location.
select street_address, count(distinct(d.department_id)), count(emp.employee_id)
from locations loc
inner join departments d
on d.location_id = loc.location_id
inner join employees emp
on emp.department_id =d.department_id
group by street_address
Query execution result:
But without using distinct for counting d.department_id it produces wrong result.
Could somebody explain what happens during query execution and why distinct fixes this issue?
The reason you are getting wrong count with count(d.department_id)
because there is multiple employees who related to same department_id and that is why you are getting same number of department and employees.
when you use count(distinct d.department_id)
, then distinct
will only count each department_id
once instead of counting every time it finds employee associated with department_id
.