Apologies for asking a question that may have answers in some form or another on here, but I was unable to make any of those solutions work for me.
I have the following query:
SELECT `user_id`, `application_id`, `unallocated_date`, `check_in_date`, `check_out_date`
FROM `student_room`
WHERE `user_id` = 17225
ORDER BY `application_id` DESC
It produces the following result:
user_id | application_id | unallocated_date | check_in_date | check_out_date
--------+----------------+---------------------+---------------------+---------------
17225 | 30782 | 2018-02-04 14:32:29 | NULL | NULL
17225 | 30782 | 2018-02-04 14:32:49 | NULL | NULL
17225 | 30782 | 2018-02-04 14:32:51 | NULL | NULL
17225 | 30782 | NULL | NULL | NULL
17225 | 30782 | NULL | 2018-02-04 14:41:54 | NULL
The fourth row in the result is a fault in my data; it should look similar to the first three rows - these rows occur happens when a student was allocated a new room, and the previous one needs to be unallocated. In this case, the unallocation of row 4 did not actually happen due to either a historical bug in the system I am working on, or user error, but most likely the former.
How can I identify ALL such rows? My attempts with GROUP BY
and HAVING
did not work, as I checked where all three date fields were NULL, but it did not pick up this particular user - so I was doing something wrong. My original query was:
SELECT COUNT(user_id) AS `count`, user_id FROM `student_room`
WHERE `unallocated_date` IS NULL
AND `check_in_date` IS NULL
AND `check_out_date` IS NULL
GROUP BY `user_id`
HAVING COUNT(user_id) > 1
ORDER BY `user_id` ASC
I tried various INNER JOIN
attempts too, but I did not of them right...
The rows that I am interested in will have at least one entry with all three dates NULL, but also one where there is a check_in_date that is NOT NULL, as per this example. If I only had the first four rows, then the data could be correct, but the fifth row's presence makes the fourth row a faulty record - it should've been given an "unallocated_date" value at the time of the allocation of the room in the fifth row, which for some reason did not happen.
Together with a friend of mine, we made the following query that works. I have now learned that you can use "EXISTS" in MySQL. I saw it used when dropping or creating tables, but never like this. It ended up that this query solves the problem:
SELECT cte.user_id, COUNT(*)
FROM (
SELECT sro.user_id
FROM student_room AS sro
WHERE sro.unallocated_date IS NULL
AND sro.check_in_date IS NULL
AND sro.check_out_date IS NULL
AND EXISTS (
SELECT *
FROM student_room AS sri
WHERE sri.user_id = sro.user_id
AND sri.student_room_id > sro.student_room_id
)
ORDER BY user_id DESC
)
AS cte
GROUP BY cte.user_ID
ORDER BY COUNT(*) DESC
This query is the result of more than an hour of tinkering with records that was erroneous, so apologies if this appears to not match the question's requirements 100%, but it does solve the problem for me.