Search code examples
mysqlsqlrecursive-querylookup-tables

How to create a SQL call that will recursively get DISTINCT ids and parent ids combined into a single column?


I have a company table that references itself, like this:

company_id    parent_id
-----------------------
1             (NULL)   
2             1        
3             2             
4             1
5             (NULL)

So a company can have a parent company and so forth.

I also have a lookup table to connect a customer to their related company's, like such:

customer_id     company_id
--------------------------
1               1
1               3
1               4
2               3
2               2

I'm trying to create a SQL call that will get a list of distinct company_id's that are related to a customer_id. This means I need to get the company ids of not just the one to one relationships from the lookup table, but also run through all the parents, add the parent ids if they aren't already in the list, then look at the parents' parents, add their ids if they aren't already in the list, and so forth.

So for example, in the tables above if I were to try to find all the company_id's where customer_id = 2, I would expect to get back:

company_id
---------------
1
2
3

Notice it got 1 recursively, but it did not list 2 twice because it's only looking for distinct id's.

I'm pretty lost when it comes to recursive SQL calls though. Is this possible or should I do this in code?


Solution

  • In MySQL 8.0 and above you can use a recursive CTE to traverse the hierarchy. But as your customers already have some parent companies of the companies they have assigned (which seems a little odd to me), that will result in duplicates. So you need to get the distinct set. For convenience I use an other CTE for that, but you could also skip that and do the DISTINCT directly in your query. Then you can simply query for the companies of a customer from the CTE.

    WITH RECURSIVE
    cte1 AS
    (
    SELECT cu.customer_id,
           cu.company_id
           FROM customer cu
    UNION ALL
    SELECT ct.customer_id,
           co.parent_id company_id
           FROM cte1 ct
                INNER JOIN company co
                           ON ct.company_id = co.company_id
           WHERE co.parent_id IS NOT NULL
    ),
    cte2 AS
    (
    SELECT DISTINCT
           ct.customer_id,
           ct.company_id
           FROM cte1 ct
    )
    SELECT company_id
           FROM cte2
           WHERE customer_id = 2;
    

    DB Fiddle
    Of course you could also shift the condition on the customer ID into the anchor of the recursive CTE. That may reduce the costs for the query if you ever only need this for exactly one customer.