Search code examples
sqlgroup-byhavinghaving-clause

Interview Question about SQL group by and having


This problem is from the following

https://www.programmerinterview.com/index.php/database-sql/advanced-sql-interview-questions-and-answers/

Assume we have two tables:

Salesperson 
ID  Name    Age Salary


Orders
Number  order_date  cust_id salesperson_id  Amount

The question is following:

We want to retrieve the names of all salespeople that have more than 1 order from the tables above. You can assume that each salesperson only has one ID. I would probably also assume that names are all distinct.

My answer was this.

select Name from 
salesperson S inner join Orders O 
on S.ID=O.salesperson_id 
group by Name 
having count(number) >=2

However, the given answer is following:

SELECT Name
FROM Orders inner join Salesperson
On Orders.salesperson_id = Salesperson.ID
GROUP BY salesperson_id, NAME
Having count(salesperson_id) > 1

If name and salesperson_id is one to one, is there any reason we have to add salesperson_id into the group by statement here? Also, if name and salesperson_id relationship is just one to one, wouldn't count(salesperson_id) be always 1 if we group by salesperson_id, name?

I was a bit confused about this, and I was wondering if anybody encountered this problem before and found this weird as well.


Solution

  • Both your solution and the accepted one are functionally identical, except for the GROUP BY clause.

    The likely reason why the accepted solution is aggregating both by Name and salesperson_id is that it could be the case that two or more salespeople happen to have the same name. Should this occur, your query would report only a single name, but with aggregate results from more than one salesperson. But, the combination of salesperson_id and Name should always be unique.

    Other than this, I actually prefer your version, and I would start joining from the salesperson table out to the Orders table.