How do I 'join' but keep things separate in tables in MySQL?


I have two tables. Table Employees:

EmployeeID (employees)  LastName (employees)    FirstName (employees)
1                       Davolio                 Nancy

And Table Orders:

OrderID (orders)    CustomerID (orders) EmployeeID (orders)
10248               90                  5
10278               45                  1
10238               47                  1

I redacted the full listing because it's hundreds of rows.

In the Employees table, EmployeeID uniquely identifies an employee, so it never repeats there. In the Orders table, however, the same EmployeeID can appear several times, because one employee can handle many orders.

Anyway, I can see here that in the Orders table, an employeeID will repeat several times, which means I need to use COUNT(EmployeeID)>=2 somewhere in my MySQL code.

This is what I'd like:

EmployeeID              Number of Orders
1                       2

As you can see, EmployeeID 1 shows up twice in the Orders table, so that employee handled 2 orders, and both link back to the single EmployeeID 1.

So this is what I tried:

SELECT EmployeeID, COUNT(EmployeeID) FROM
employees A inner join 
orders B
ON (A.EmployeeID=B.EmployeeID)
WHERE COUNT(B.EmployeeID >=2)

This is the output:

Error: Column 'EmployeeID' in field list is ambiguous — ERROR CODE 1052

I'm not sure how I would get this result in this scenario.


Solution

  • There's no need to join with the employees table; you can get the employee ID from orders. You would only need the join if you also wanted other information from the employees table, such as the employee's name.

    You need GROUP BY employeeID to get a count for each employee.

    The >= 2 should not be inside the COUNT() function; you want to compare the result of COUNT() with 2.

    You need to use HAVING rather than WHERE. WHERE selects the rows to process before aggregating, so it cannot test an aggregate such as COUNT(); HAVING filters the groups after aggregating.

    You should use COUNT(*) rather than COUNT(columnName) unless you need to exclude null values of the column from the count.

    If you give an alias to the COUNT(*) result, MySQL lets you use that alias in the HAVING clause rather than restating the function.

    SELECT EmployeeID, COUNT(*) AS number_of_orders
    FROM orders
    GROUP BY EmployeeID
    HAVING number_of_orders >= 2;
    

    The error about the ambiguous column occurs because both tables have an EmployeeID column. In the SELECT list you need to specify A.EmployeeID or B.EmployeeID, just as you did in the ON clause.
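
    If you do want the employee's name next to the count, the join becomes useful. The following is a minimal sketch of how your original query might look with the points above applied. It assumes the table and column names shown in the question; the aliases e and o and the number_of_orders alias are just illustrative choices.

    ```sql
    -- Count orders per employee and include the name from the employees table.
    -- Every EmployeeID reference is qualified with a table alias (e or o),
    -- which avoids the "ambiguous column" error 1052.
    SELECT e.EmployeeID,
           e.LastName,
           e.FirstName,
           COUNT(*) AS number_of_orders
    FROM employees AS e
    INNER JOIN orders AS o
            ON e.EmployeeID = o.EmployeeID
    -- One group per employee; the name columns are listed too so the
    -- query also runs when ONLY_FULL_GROUP_BY is enabled.
    GROUP BY e.EmployeeID, e.LastName, e.FirstName
    -- HAVING runs after grouping, so it can filter on the aggregated count.
    HAVING number_of_orders >= 2;
    ```

    With only the sample rows shown in the question, this should return a single row for employee 1 (Davolio, Nancy) with a count of 2. Because it is an INNER JOIN, employees with no orders never appear, which is fine here since the HAVING clause would filter them out anyway.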