I have three tables:
Person
+--------+-----------+
| fName | lName |
+--------+-----------+
| Paul | McCartney |
| John | Lennon |
| Jon | Stewart |
| Daniel | Tosh |
| Steven | Colbert |
| Pink | Floyd |
| The | Beatles |
| Arcade | Fire |
| First | Last |
| Andrew | Bird |
+--------+-----------+
Publication
+----+---------------------------------------+------+-----------+---------+
| id | title | year | pageStart | pageEnd |
+----+---------------------------------------+------+-----------+---------+
| 9 | The Dark Side of the Moon | 1973 | 0 | 0 |
| 10 | Piper At The Gates of Dawn | 1967 | 0 | 0 |
| 11 | Sgt. Pepper's Lonely Hearts Band Club | 1967 | 0 | 0 |
| 12 | Happy Thoughts | 2007 | 0 | 60 |
| 13 | Wish You Were Here | 1975 | 0 | 0 |
| 14 | Funeral | 2004 | 0 | 0 |
+----+---------------------------------------+------+-----------+---------+
Person_Publication
+-----------+----------------+--------+---------------+
| person_id | publication_id | editor | author_number |
+-----------+----------------+--------+---------------+
| 11 | 11 | 0 | 1 |
| 12 | 11 | 0 | 1 |
| 16 | 9 | 0 | 1 |
| 17 | 11 | 0 | 1 |
+-----------+----------------+--------+---------------+
I'm trying to select all authors of a certain publication using the following query:
SELECT fName , lName
FROM Publication , Person, Person_Publication
WHERE Person.id = Person_Publication.person_id
AND Person_Publication.publication_id = 11;
But the results I get are always duplicates (always 6x for some reason). The results:
+-------+-----------+
| fName | lName |
+-------+-----------+
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
| Paul | McCartney |
| John | Lennon |
| The | Beatles |
+-------+-----------+
18 rows in set (0.03 sec)
Can somebody please tell me why this is happening and how to fix this?
You are including three tables in your query:
FROM Publication, Person, Person_Publication
but you have only one join condition:
WHERE Person.id = Person_Publication.person_id
You end up with a cartesian product between Publication
and Person JOIN Person_Publication
Add the following condition to your WHERE
block:
AND Publication.id = Person_Publication.publication.id
A perfect example of why the explicit JOIN syntax is prefered. With the following syntax:
SELECT fName, lName
FROM Publication
JOIN Person_Publication ON Person_Publication.publication.id = Publication.id
JOIN Person ON Person.id = Person_Publication.person_id
WHERE Person_Publication.publication_id = 11;
.. such a mistake simply cannot happen.