I have a table with two rows: IMDB_ID and Actor. I am trying to find the pairs of actors who co-stared in 3 or more movies. The pairs of names should be unique, meaning that ‘actor A, actor B’ and ‘actor B, actor A’ are the same pair, so only one of them should appear. Here's a few lines of the table, but not the whole thing:
IMDB_ID ACTOR
---------- -----------
tt0111161 Tim Robbins
tt0111161 Morgan Free
tt0111161 Bob Gunton
tt0111161 William Sad
tt0111161 Clancy Brow
tt0111161 Gil Bellows
tt0111161 Mark Rolsto
tt0111161 James Whitm
tt0111161 Jeffrey DeM
tt0111161 Larry Brand
tt0111161 Neil Giunto
tt0111161 Brian Libby
tt0111161 David Prova
tt0111161 Joseph Ragn
tt0111161 Jude Ciccol
tt0068646 Marlon Bran
tt0068646 Al Pacino
I tried:
SELECT DISTINCT movie_actor.actor, movie_actor.actor, COUNT(movie_actor.actor) AS occurrence
FROM movie_actor join movie_actor
ON movie_actor.imdb_id = movies.imdb_id
WHERE occurrence >= 3
GROUP BY movie_actor.actor
ORDER BY occurrence DESC, movie_actor.actor ASC;
and received an operational error that said: ambiguous column name: movie_actor.actor
You should be using table aliases when using a self-join. However, your query has numerous problems, indicating that you need more practice in writing SQL. Some good habits:
having
clause when that is your intention.Here is SQL closer to what you want to do:
SELECT ma.actor, ma2.actor, COUNT(*) AS occurrence
FROM movie_actor ma join
movie_actor ma2
ON ma.imdb_id = ma2.imdb_id
GROUP BY ma.actor, ma2.actor
HAVING COUNT(*) >= 3
ORDER BY occurrence DESC, ma.actor ASC;