Search code examples
mysqlinner-joincommon-table-expressionrecursive-cte

Understanding inner join in recursive CTE (MySQL)


The code below is from: https://www.freecodecamp.org/news/mysql-common-table-expressions/

Data:

CREATE TABLE
categories (
    id int,
    cat_name varchar(100),
    parent_category_id int DEFAULT NULL
);

INSERT INTO
    categories
VALUES
    (1, 'Mens', NULL),
    (2, 'Tops', 1),
    (3, 'Jerseys', 2),
    (4, 'England', 3);

Code:

WITH RECURSIVE
category_tree AS (
    SELECT
        id,
        cat_name,
        parent_category_id,
        cat_name AS full_name
    FROM categories
    WHERE parent_category_id IS NULL                
    UNION ALL                                       
    SELECT
        c.id,                                       
        c.cat_name,                                 
        c.parent_category_id,
        CONCAT (ct.full_name, ' > ', c.cat_name)
    FROM categories c
        JOIN category_tree ct 
        ON c.parent_category_id = ct.id             
)
SELECT
    id, full_name
FROM
    category_tree;
categories:
| id | cat_name | parent_category_id |
| ---| ---------| -------------------|
| 1  | Mens     | NULL               |
| 2  | Tops     | 1                  |
| 3  | Jerseys  | 2                  | 
| 4  | England  | 3                  |

category_tree: 
| id | full_name                       |
| ---| ------------------------------- |
| 1  | Mens                            |
| 2  | Mens > Tops                     |
| 3  | Mens > Tops > Jerseys           |
| 4  | Mens > Tops > Jerseys > England |
  • I have trouble understanding this line: ON c.parent_category_id = ct.id
  • From my understanding, ct.id will always return id: 1 and cat_name: 'Mens'.
  • Shouldn't the category_tree CTE stop at "Mens > Tops"?

I would be grateful for any detailed explanations. Thank you very much.


Solution

  • See example. I added some additional data to your example to make the answer clearer.

    Your data (source)

    id cat_name parent_category_id
    1 Mens null
    2 Tops 1
    3 Jerseys 2
    4 England-1 3
    5 England-2 3

    And recursive query

    WITH RECURSIVE category_tree AS (
       SELECT 0 as lvl, id, cat_name, parent_category_id,
          cat_name AS full_name
       FROM categories
       WHERE parent_category_id IS NULL                
       UNION ALL                                       
       SELECT lvl+1 as lvl,c.id, c.cat_name,c.parent_category_id,
          CONCAT (ct.full_name, ' > ', c.cat_name)
       FROM category_tree ct 
       INNER JOIN categories c
                ON c.parent_category_id = ct.id             
        )
    SELECT * 
    FROM category_tree;
    
    lvl id cat_name parent_category_id full_name
    0 1 Mens null Mens
    1 2 Tops 1 Mens > Tops
    2 3 Jerseys 2 Mens > Tops > Jerseys
    3 4 England-1 3 Mens > Tops > Jerseys > England-1
    3 5 England-2 3 Mens > Tops > Jerseys > England-2

    Let's look at the request step by step. I'll add column lvl for step numbering.

    1. The first step is to execute anchor part of query
       SELECT 0 as lvl, id, cat_name, parent_category_id,
          cat_name AS full_name
       FROM categories
       WHERE parent_category_id IS NULL                
    

    Output is

    lvl id cat_name parent_category_id full_name
    0 1 Mens null Mens
    1. Second step - JOIN (INNER) above result with source
       SELECT lvl+1 as lvl,c.id, c.cat_name,c.parent_category_id,
          CONCAT (ct.full_name, ' > ', c.cat_name)
       FROM category_tree ct 
       INNER JOIN categories c
                ON c.parent_category_id = ct.id             
    

    This part output is

    lvl id cat_name parent_category_id full_name
    1 2 Tops 1 Mens > Tops

    3.Union with anchor result is

    lvl id cat_name parent_category_id full_name
    0 1 Mens null Mens
    1 2 Tops 1 Mens > Tops

    4.Next step - recursive part of query - JOIN step 3 result with source.
    We JOIN (INNER)

    lvl id cat_name parent_category_id full_name
    0 1 Mens null Mens
    1 2 Tops 1 Mens > Tops

    with source table.

    JOIN result is

    lvl id cat_name parent_category_id full_name
    2 3 Jerseys 2 Mens > Tops > Jerseys

    And after UNION ALL

    lvl id cat_name parent_category_id full_name
    0 1 Mens null Mens
    1 2 Tops 1 Mens > Tops
    2 3 Jerseys 2 Mens > Tops > Jerseys
    1. Next step - JOIN (INNER) above output with source. This step is interesting because JOIN will output the result in 2 rows.
      JOIN output is
    lvl id cat_name parent_category_id full_name
    3 4 England-1 3 Mens > Tops > Jerseys > England-1
    3 5 England-2 3 Mens > Tops > Jerseys > England-2

    After UNION ALL

    lvl id cat_name parent_category_id full_name
    0 1 Mens null Mens
    1 2 Tops 1 Mens > Tops
    2 3 Jerseys 2 Mens > Tops > Jerseys
    3 4 England-1 3 Mens > Tops > Jerseys > England-1
    3 5 England-2 3 Mens > Tops > Jerseys > England-2

    fiddle