Search code examples
mysqlsqlleft-joinunionright-join

MySQL Left Join Right Join union returns incorrect data


My classmates and I have been run rampant trying to get this query to work correctly.

Goal: Return a single table that consists of two tables fully joined by means of a left join and then a right join. There will be null values. The purpose of this table is to graph both speed and weight with speed on the left Y-axis and weight on the right Y-axis in order to track how weight coincides with speed. Some values will have shared dates.

The Assumptions:

  • A player cannot record their weight multiple times in a day.
  • A player cannot have multiple sprint speed entries in a single day.
  • No date can exist independent of AT LEAST a sprint value OR a weight. They must have AT LEAST one of the values associated with it.

The Issue: Several rows are returning values when they should be NULL instead, and we have no clue where this data is coming from.

Current Incorrect Query:

SELECT weight.username, weight.weight, metric.speedOrDistance, weighInDate  
FROM playerweight as weight LEFT JOIN sprint AS metric ON
weight.weighInDate = metric.workoutDate 
WHERE weight.username = 'testuser' 
UNION  
SELECT metric.username, weight.weight,
metric.speedOrDistance, workoutDate 
FROM playerweight as weight RIGHT
JOIN sprint AS metric ON weight.weighInDate = metric.workoutDate 
WHERE metric.username = 'testuser' 
ORDER BY weighInDate;

Example Data:

sprint Table

+----------+-------------+-----------------+
| username | workoutDate | speedOrDistance |
+----------+-------------+-----------------+
| jdoe     | 2016-10-10  |            2.00 |
| jdoe     | 2016-10-17  |            3.50 |
| jdoe     | 2016-10-24  |            3.00 |
| jdoe     | 2016-11-01  |            5.00 |
| jdoe     | 2016-11-10  |            4.00 |
| foo      | 2016-11-03  |            2.50 |
| foo      | 2016-11-14  |            5.00 |
| foo      | 2016-11-16  |            4.00 |
| foo      | 2016-11-18  |            3.00 |
| testuser | 2016-10-11  |            3.40 |
| testuser | 2016-10-17  |            3.40 |
| testuser | 2016-11-13  |            1.50 |
| testuser | 2016-11-15  |            2.00 |
| testuser | 2016-11-17  |            4.00 |
+----------+-------------+-----------------+

playerWeight

+----------+-------------+--------+
| username | weighInDate | weight |
+----------+-------------+--------+
| jdoe     | 2016-10-10  |    160 |
| jdoe     | 2016-10-17  |    160 |
| jdoe     | 2016-10-24  |    170 |
| jdoe     | 2016-11-01  |    180 |
| jdoe     | 2016-11-08  |    180 |
| jdoe     | 2016-11-09  |    200 |
| jdoe     | 2016-11-11  |    178 |
| jdoe     | 2016-11-22  |    195 |
| foo      | 2016-11-01  |    190 |
| foo      | 2016-11-10  |    185 |
| foo      | 2016-11-14  |    175 |
| foo      | 2016-11-18  |    180 |
| bar      | 2011-01-16  |    170 |
| bar      | 2011-07-16  |    177 |
| bar      | 2011-09-16  |    169 |
| testuser | 2016-11-04  |    150 |
| testuser | 2016-11-08  |    200 |
| testuser | 2016-11-11  |    195 |
| testuser | 2016-11-14  |    175 |
| testuser | 2016-11-17  |    180 |
+----------+-------------+--------+

Incorrect Result Table

+----------+--------+-----------------+-------------+
| username | weight | speedOrDistance | weighInDate |
+----------+--------+-----------------+-------------+
| testuser | NULL   | 3.4             | 2016-10-11  |
| testuser | 160    | 3.4             | 2016-10-17  | -- ERROR ROW, weight should not have a value
| testuser | 150    | NULL            | 2016-11-04  |
| testuser | 200    | NULL            | 2016-11-08  |
| testuser | 195    | NULL            | 2016-11-11  |
| testuser | NULL   | 1.5             | 2016-11-13  |
| testuser | 175    | 5               | 2016-11-14  | -- ERROR ROW, speedOrDistance should not have a value
| testuser | NULL   | 2               | 2016-11-15  |
| testuser | 180    | 4               | 2016-11-17  |
+----------+--------+-----------------+-------------+

Ideal Result Table

+----------+--------+-----------------+-------------+
| username | weight | speedOrDistance | weighInDate |
+----------+--------+-----------------+-------------+
| testuser | NULL   | 3.4             | 2016-10-11  |
| testuser | NULL   | 3.4             | 2016-10-17  |
| testuser | 150    | NULL            | 2016-11-04  |
| testuser | 200    | NULL            | 2016-11-08  |
| testuser | 195    | NULL            | 2016-11-11  |
| testuser | NULL   | 1.5             | 2016-11-13  |
| testuser | 175    | NULL            | 2016-11-14  |
| testuser | NULL   | 2               | 2016-11-15  |
| testuser | 180    | 4               | 2016-11-17  |
+----------+--------+-----------------+-------------+

Any help that you guys could offer would be greatly appreciated. We have no idea why this isn't working.

Thank You!!!


Solution

  • You do the joins on workoutDate = weighInDate. This gives you combinations of unrelated rows, which remain in the result set, even with unrelated user names.

    To fix this, you must join on both dates and usernames, e.g.

    LEFT JOIN sprint AS metric ON weight.weighInDate = metric.workoutDate
                                  and weight.username = metric.username
    

    similarly on right join, of course.


    Added to the answer because it helps illustrate the cause of the problem. The @OlafDietsche has the correct solution. I just wanted you to see where the actual wrong data came from.

    To show you the problem with your join this record:

    | testuser | 2016-10-17  |            3.40
    

    is joining to this record

    | jdoe     | 2016-10-17  |    160 |
    

    because you only joined on date.