Search code examples
sqlsubqueryscalar-subquery

Why the output of a SELECT can be another SELECT?


I am rather confused about the following SQL query:

SELECT (SELECT S.name FROM student AS S
    WHERE S.sid = E.sid) AS sname
FROM enrolled as E
WHERE cid='15-455';

SELECT should be followed by an output, but why here there is another SELECT? How to understand the step-by-step meaning of this query?

The following is the query that has the same result of the above query, but its meaning is rather explicit: the output of the second SELECT is passed into the IN() function.

SELECT name FROM student
WHERE sid IN ( 
    SELECT sid FROM enrolled
    WHERE cid = '15-445'
);

Here are the original tables of this question:

mysql> select * from student;
+-------+--------+------------+------+---------+
| sid   | name   | login      | age  | gpa     |
+-------+--------+------------+------+---------+
| 53666 | Kanye  | kayne@cs   |   39 | 4.00000 |
| 53688 | Bieber | jbieber@cs |   22 | 3.90000 |
| 53655 | Tupac  | shakur@cs  |   26 | 3.50000 |
+-------+--------+------------+------+---------+

mysql> select * from enrolled;
+-------+--------+-------+
| sid   | cid    | grade |
+-------+--------+-------+
| 53666 | 15-445 | C     |
| 53688 | 15-721 | A     |
| 53688 | 15-826 | B     |
| 53655 | 15-445 | B     |
| 53666 | 15-721 | C     |
+-------+--------+-------+

mysql> select * from course;
+--------+------------------------------+
| cid    | name                         |
+--------+------------------------------+
| 15-445 | Database Systems             |
| 15-721 | Advanced Database Systems    |
| 15-826 | Data Mining                  |
| 15-823 | Advanced Topics in Databases |
+--------+------------------------------+

Solution

  • This construct:

    SELECT (SELECT S.name FROM student S WHERE S.sid = E.sid) AS sname
    -------^
    

    is called a scalar subquery. This is a special type of subquery that has two important properties:

    • It returns one column.
    • It returns at most one row.

    In this case, the scalar subquery is also a correlated subquery meaning that it references columns in the outer query, via the where clause.

    A scalar subquery can be using almost anywhere that a scalar (i.e. constant value) can be used in a query. They can be handy. They are not exactly equivalent to a join, because:

    1. An inner join can filter values. A scalar subquery returns NULL if there are no rows returned.
    2. A join can multiply the number of rows. A scalar subquery returns an error if it returns more than one row.