Search code examples
mysqlselectleft-join

MySQL SELECT query from 2 tables producing inconsistent results


The goal is to return an array of available users that are assigned to the site we are working on and are available to do the task we want to assign.

For this question, we can assume that we are working on site_id 4 and we are trying to assign task_id 2.

site_tasks_assignments

id | user_id | site_id | task_id | complete
--------------------------------------------
12 | 6       | 4       | 2       | 1
13 | 4       | 3       | 2       | 0
14 | 2       | 4       | 3       | 0
15 | 6       | 4       | 4       | 1

users

id | name    | site_id 
-----------------------
1  | bob     | 1
2  | henry   | 4
3  | jane    | 2
4  | mary    | 3
5  | kyle    | 4
6  | steve   | 4

Users should show up in the results if:

  • The users.site_id is = the id of the site we're working at (again, assume site_id 4)

AND, any of the following are true:

  • the users.id does not appear in the site_task_assignments.user_id at all, meaning they have been assigned no tasks at any sites,
  • the user is in the site_task_assignments table but has not been assigned that specific task (task_id 2, in this case),
  • the user is in the site_task_assignments table and has been assigned that specific task AND the site_task_assignments.complete column for that row is 1

I expect:

id | name    | site_id 
-----------------------
2  | henry   | 4   <- not assigned to task 2, can work on site 4
5  | kyle    | 4   <- not assigned to any tasks, can work on site 4
6  | steve   | 4   <- WAS assigned to task 2, task 2 complete, can do task 2 again, can work on site 4

I tried several versions of a query, some involving some sort of JOIN, and a couple with subqueries, etc. The closest I've gotten:

SELECT u.*, sta.task_id, sta.user_id, sta.complete
FROM users AS u
LEFT JOIN site_task_assignments AS ci
ON u.id = sta.user_id
WHERE u.site_id = 4
AND (u.id NOT IN (SELECT user_id FROM site_task_assignments)
    OR (u.id IN (SELECT user_id FROM site_task_assignments)
        AND sta.task_id = 2 AND complete = 1)
    OR (u.id IN (SELECT user_id FROM site_task_assignments)
        AND (sta.task_id = 2 AND complete = 1
            OR sta.task_id != 2 AND complete = 0)))

This gives me inconsistent results. On some tasks, it's fine, on other tasks that have already been assigned it sometimes gives me users where they have already been assigned to a task but complete is 0. On tasks that have not been assigned at all, it gives me duplicates of the same user if they are already in the site_task_assignments table for other tasks.


Solution

  • I might be missing something but it seems that your three criteria can be rewritten as:

    • AND users.id is not currently assigned to incomplete task_id 2
    SELECT u.*, sta.task_id, sta.user_id, sta.complete
    FROM users AS u
    LEFT JOIN site_tasks_assignments AS sta
        ON u.id = sta.user_id
        AND u.site_id = sta.site_id
        AND sta.complete = 0
        AND sta.task_id = 2
    WHERE u.site_id = 4
    AND sta.id IS NULL;
    

    Here's a db<>fiddle.