Search code examples
sqldatabaserelational-division

SQL finding distinct pairs


I'm trying to figure out how to write an SQL query to find the distinct pairs of student ids that have the same quarter and year and class id and that have been in more than one classid together.

So the database looks like this

Studentid courseid quarter       year 
11035   1020       Fall          2012    
11035   1092       Fall          2012    
75234   3201       Winter        2012    
8871    1092       Fall          2013    
39077   1092       Fall          2013    
57923   9219       Winter        2013    
60973   9219       Winter        2013     
19992   3201       Winter        2013    
60973   8772       Spring        2013    
90421   8772       Spring        2013    
90421   2987       Spring        2013    
60973   2987       Spring        2013

the result that I am trying to get is:

Studentid student id  year quarter course id

60973   90421   2013    Spring  8772    
60973   90421   2013    Spring  2987

so far I have

SELECT s.studentid, 
       st.studentid, 
       st.year, 
       s.quarter, 
       st.courseid
  FROM enrolled s 
       LEFT JOIN enrolled st 
                 ON st.COURSEID = s.COURSEID
 WHERE s.QUARTER = st.quarter
   AND s.YEAR = st.year 
   AND s.studentid <> st.studentid

but it is giving me every combination that can be created


Solution

  • Self join twice - once for the match, and again for to find a different match:

    SELECT s.studentid, st.studentid, s.year, s.quarter, s.courseid
    FROM enrolled s
    JOIN enrolled st ON st.COURSEID = s.COURSEID
      AND st.quarter = s.quarter
      AND st.year = s.year
      AND s.studentid < st.studentid
    JOIN enrolled e2 ON e2.COURSEID != s.COURSEID
      AND e2.quarter = s.quarter
      AND e2.year = s.year
      AND e2.studentid = st.studentid
    

    Output:

    | STUDENTID | OTHER_STUDENTID | YEAR | QUARTER | COURSEID |
    |-----------|-----------------|------|---------|----------|
    |     60973 |           90421 | 2013 |  Spring |     2987 |
    |     60973 |           90421 | 2013 |  Spring |     8772 |
    

    See SQLFiddle

    Notes:

    • You want an (inner) join, not a left join
    • Use s.studentid < st.studentid to stop:
      • students joining to themselves, and
      • both sides of pairs from displaying
    • Use a different course for the second join but the same other student id