I have a query that looks like this:
SELECT 'FY2000' AS FY, COUNT(DISTINCT SGBSTDN_PIDM) AS CHEM_MAJORS
FROM SATURN.SGBSTDN, SATURN.SFRSTCR
WHERE SGBSTDN_PIDM = SFRSTCR_PIDM
AND SGBSTDN_TERM_CODE_EFF = (SELECT MAX(SGBSTDN_TERM_CODE_EFF)
FROM SATURN.SGBSTDN
WHERE SGBSTDN_TERM_CODE_EFF <= '200002'
AND SGBSTDN_PIDM = SFRSTCR_PIDM)
AND SGBSTDN_MAJR_CODE_1 = 'CHEM'
AND SFRSTCR_TERM_CODE BETWEEN '199905' AND '200002'
AND (SFRSTCR_RSTS_CODE LIKE 'R%' OR SFRSTCR_RSTS_CODE LIKE 'W%')
AND SFRSTCR_CREDIT_HR >= 1
It returns a count of 48, which I believe is correct. However, I don't understand why the subquery doesn't need SATURN.SFRSTCR
in the FROM clause in order to reference SFRSTCR_PIDM
. I thought subqueries were self contained and couldn't see the rest of the query?
But, if I add SATURN.SFRSTCR to the subquery, the count changes to 22. If I take the AND SGBSTDN_PIDM = SFRSTCR_PIDM
out of the subquery, the count also changes to 22. Can someone explain this to me?
You have a correlated subquery. This is a bit different from a non-correlated subquery, because it can include references to outer tables.
When using correlated subqueries, always use the table aliases for all table references. This is a good idea in general, but should be followed more attentively for correlated subqueries.
AND SGBSTDN_TERM_CODE_EFF = (SELECT MAX(SGBSTDN.SGBSTDN_TERM_CODE_EFF)
FROM SATURN.SGBSTDN
WHERE SGBSTDN.SGBSTDN_TERM_CODE_EFF <= '200002'
AND SGBSTDN.SGBSTDN_PIDM = SFRSTCR.SFRSTCR_PIDM
)
For each value of SFRSTCR.SFRSTCR_PIDM
(and the other conditions), the subquery is getting the maximum date.
In most versions of SQL, correlated subqueries are allowed in the from
, where
, and having
clauses. (They might also be allowed in order by
.)