ID FirstName LastName Gender Salary
1 Ben Hoskins Male 70000
2 Mark Hastings Male 60000
4 Ben Hoskins Male 70000
8 John Stanmore Male 80000
While running the query:
select *
from Employees
where Salary > (SELECT AVG(distinct SALARY) FROM employees)
It is displaying 2 records for 7000 it should display one. Why is it showing 2 records if I have used distinct?
You have used distinct
the subquery, not in the outer query, so the outer query still sees the data with the duplicates.
You could solve this using:
select e.FirstName, e.LastName, e.Gender, e.Salary
from Employees e
where e.Salary > (SELECT AVG(distinct e2.SALARY) FROM employees e2);
However, something seems seriously wrong with the data model if you have rows with such duplicates. The data should be fixed.
In the meantime, you can work around the problem. You can phrase your query using a CTE that eliminates the duplicates:
with e as (
select e.*
from (select e.*,
row_number(). over (partition by firstname, lastname, gender order by id desc) as seqnum
from employees e
)
where seqnum = 1
)
select e.*
from e
where e.salary > (select avg(salary) from e)