Search code examples
sqlmysqlquery-optimizationleft-join

Left join or select from multiple table using comma (,)


I'm curious as to why we need to use LEFT JOIN since we can use commas to select multiple tables.

What are the differences between LEFT JOIN and using commas to select multiple tables.

Which one is faster?

Here is my code:

   SELECT mw.*, 
          nvs.* 
     FROM mst_words mw 
LEFT JOIN (SELECT no as nonvs, 
                  owner, 
                  owner_no, 
                  vocab_no, 
                  correct 
             FROM vocab_stats 
            WHERE owner = 1111) AS nvs ON mw.no = nvs.vocab_no 
    WHERE (nvs.correct > 0 ) 
      AND mw.level = 1

...and:

SELECT * 
  FROM vocab_stats vs, 
       mst_words mw 
 WHERE mw.no = vs.vocab_no 
   AND vs.correct > 0 
   AND mw.level = 1 
   AND vs.owner = 1111

Solution

  • First of all, to be completely equivalent, the first query should have been written

       SELECT mw.*, 
              nvs.* 
         FROM mst_words mw 
    LEFT JOIN (SELECT *
                 FROM vocab_stats 
                WHERE owner = 1111) AS nvs ON mw.no = nvs.vocab_no 
        WHERE (nvs.correct > 0 ) 
          AND mw.level = 1
    

    So that mw.* and nvs.* together produce the same set as the 2nd query's singular *. The query as you have written can use an INNER JOIN, since it includes a filter on nvs.correct.

    The general form

    TABLEA LEFT JOIN TABLEB ON <CONDITION>
    

    attempts to find TableB records based on the condition. If the fails, the results from TABLEA are kept, with all the columns from TableB set to NULL. In contrast

    TABLEA INNER JOIN TABLEB ON <CONDITION>
    

    also attempts to find TableB records based on the condition. However, when fails, the particular record from TableA is removed from the output result set.

    The ANSI standard for CROSS JOIN produces a Cartesian product between the two tables.

    TABLEA CROSS JOIN TABLEB
      -- # or in older syntax, simply using commas
    TABLEA, TABLEB
    

    The intention of the syntax is that EACH row in TABLEA is joined to EACH row in TABLEB. So 4 rows in A and 3 rows in B produces 12 rows of output. When paired with conditions in the WHERE clause, it sometimes produces the same behaviour of the INNER JOIN, since they express the same thing (condition between A and B => keep or not). However, it is a lot clearer when reading as to the intention when you use INNER JOIN instead of commas.

    Performance-wise, most DBMS will process a LEFT join faster than an INNER JOIN. The comma notation can cause database systems to misinterpret the intention and produce a bad query plan - so another plus for SQL92 notation.

    Why do we need LEFT JOIN? If the explanation of LEFT JOIN above is still not enough (keep records in A without matches in B), then consider that to achieve the same, you would need a complex UNION between two sets using the old comma-notation to achieve the same effect. But as previously stated, this doesn't apply to your example, which is really an INNER JOIN hiding behind a LEFT JOIN.

    Notes:

    • The RIGHT JOIN is the same as LEFT, except that it starts with TABLEB (right side) instead of A.
    • RIGHT and LEFT JOINS are both OUTER joins. The word OUTER is optional, i.e. it can be written as LEFT OUTER JOIN.
    • The third type of OUTER join is FULL OUTER join, but that is not discussed here.