Search code examples
mysqlrecursioncommon-table-expression

mysql cte recursive to get all siblings


I'm trying to get all siblings / connections of given user_id, based on phone_id(s). My table looks like the following (ad_users_id and ad_phones_id are foreign keys of specific tables holding details about users / phones).

id ad_users_id ad_phones_id date_time
90 7177 19515
125 1967 202
235 7177 19868
236 7177 19519
237 7177 19516
238 7177 19517
239 7177 20196
245 7177 19585
247 7177 19701
295 7177 19703
342 3915 19868
343 3915 19519
346 3915 19516
426 3915 19517
368 3915 20196
414 3915 19585
403 3915 19701
404 3915 19703
1707 1962 202
2124 1967 25141
2129 9329 25141

Basically I need to get all direct and indirect connections of given user_id. I managed to build a mysql query (recursive CTE) which doesn't look really clean, but looks like it also doesn't work for users with several phone_id(s), like user_id 7177.

WITH RECURSIVE connections (table_id, user_id, phone_id)
AS (
  SELECT
    CAST(ut.id AS CHAR(200)) AS table_id,
    ut.ad_users_id AS user_id,
    ut.ad_phones_id AS phone_id
  FROM ad_phones_users_taxonomy ut
  WHERE ut.ad_users_id = 7177  -- Starting user
  UNION ALL
  SELECT
    CONCAT(c.table_id, ',', CAST(ut2.id AS CHAR(200))) AS table_id, -- CONCATENARE pentru a evita din bucla id-urile deja trase
    ut2.ad_users_id AS user_id,
    ut2.ad_phones_id AS phone_id
  FROM ad_phones_users_taxonomy ut2
  INNER JOIN connections c ON ut2.ad_phones_id = c.phone_id OR ut2.ad_users_id = c.user_id
    WHERE NOT FIND_IN_SET(ut2.id, c.table_id)
        LIMIT 10
)
SELECT
  table_id, user_id, phone_id
    FROM connections

Any suggestion would be much appreciated.

Expecting to retrieve all connections (direct and indirect), but for user_id(s) with multiple phone_id(s) almost "kills" mysql server

--- later edit --- This is the CREATE TABLE:

CREATE TABLE `ad_phones_users_taxonomy` (
  `id` int NOT NULL PRIMARY KEY AUTO_INCREMENT,
  `ad_users_id` int DEFAULT NULL,
  `ad_phones_id` int DEFAULT NULL,
  `data_timp` timestamp NULL DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

INSERT INTO `ad_phones_users_taxonomy` (`id`, `ad_users_id`, `ad_phones_id`, `date_time`) VALUES
(90, 7177, 19515, '2024-05-08 19:57:41'),
(125, 1967, 202, '2024-05-08 19:58:26'),
(235, 7177, 19868, '2024-05-08 20:02:52'),
(236, 7177, 19519, '2024-05-08 20:02:55'),
(237, 7177, 19516, '2024-05-08 20:02:58'),
(238, 7177, 19517, '2024-05-08 20:03:01'),
(239, 7177, 20196, '2024-05-08 20:03:04'),
(245, 7177, 19585, '2024-05-08 20:03:25'),
(247, 7177, 19701, '2024-05-08 20:03:31'),
(295, 7177, 19703, '2024-05-08 20:05:55'),
(342, 3915, 19868, '2024-05-08 20:08:21'),
(343, 3915, 19519, '2024-05-08 20:08:24'),
(346, 3915, 19516, '2024-05-08 20:08:33'),
(426, 3915, 19517, '2024-05-08 20:12:27'),
(368, 3915, 20196, '2024-05-08 20:09:36'),
(414, 3915, 19585, '2024-05-08 20:11:53'),
(403, 3915, 19701, '2024-05-08 20:11:21'),
(404, 3915, 19703, '2024-05-08 20:11:24'),
(1707, 1962, 202, NULL),
(2124, 1967, 25141, '2024-05-13 19:27:55'),
(2129, 9329, 25141, '2024-05-13 19:45:32');

For any of the user_id(s): 1962, 1967, 9329 this would be the expected outcome:

id ad_users_id ad_phones_id date_time
125 1967 202
1707 1962 202
2124 1967 25141
2129 9329 25141

For any of user_id(s) 7177 or 3915, expected outcome:

id ad_users_id ad_phones_id date_time
90 7177 19515
235 7177 19868
236 7177 19519
237 7177 19516
238 7177 19517
239 7177 20196
245 7177 19585
247 7177 19701
295 7177 19703
342 3915 19868
343 3915 19519
346 3915 19516
426 3915 19517
368 3915 20196
414 3915 19585
403 3915 19701
404 3915 19703

Solution

  • CREATE PROCEDURE get_all (in_user_id INT)
    WITH RECURSIVE
    cte AS (
    
    -- anchor part - select all rows for specified user
    
      SELECT ad_users_id, ad_phones_id
      FROM ad_phones_users_taxonomy
      WHERE ad_users_id = in_user_id
    
    -- use UNION DISTINCT which will remove duplicates
    -- including the users which are processed/collected already
    
      UNION DISTINCT
    
    -- recursive part - add siblings 
    -- join one table copy by the phone then another copy by the user id
    -- i.e. get a user with the same phone from t1 
    -- then get all phones for this user from t2
    
      SELECT t2.ad_users_id, t2.ad_phones_id
      FROM cte 
      JOIN ad_phones_users_taxonomy t1 USING (ad_phones_id)
      JOIN ad_phones_users_taxonomy t2 ON t1.ad_users_id = t2.ad_users_id
      )
    
    -- finally retrieve original rows for gathered users
    
    SELECT DISTINCT ad_phones_users_taxonomy.*
    FROM ad_phones_users_taxonomy
    JOIN cte USING (ad_users_id)
    ORDER BY id;
    

    fiddle

    PS. The query is placed into stored procedure which allows do not repeat the whole text avoiding misprints. You may use the query only, of course.