Search code examples
mysqljoinleft-join

How to join 2 tables with one of them containing multiple values in a single column


I have two tables users and interests which i'm trying to join. Inside users table i have columns as id, name, interest, etc. The interest column contain multiple values as "1,2,3". My second table interests have 2 columns id and name as:

id | name
-------------
1  | business  
2  | farming 
3  | fishing 

What i want to do is join interests table with users table so i get the following output:

users table:

id | name     | interest | interest_name 
----------------------------------------------
1  | username | "1,2"    | "business, farming"
2  | username | "2,3"    | " farming, fishing"

I wrote the following query to achieve this:

select users.*, interests.name as interest_name
  from users
  left join interests on users.interest = interests.id;

Results i got:

id | name     | interest | interest_name
----------------------------------------
1  | username | "1,2"    | "business"
2  | username | "2,3"    | " farming"

Problem:

I'm only getting the name of first values from interest column whereas i want all the values from interest column i have already tried using group_concat and find_in_set but getting the same results.


Solution

  • In the case you cannot create an additional database table in order to normalize the data...

    Here's a solution that creates an ad hoc, temporary user_interests table within the query.

    SELECT users.id user_id, username, interests, interests.interest
    FROM users
    LEFT JOIN (
      SELECT
        users.id user_id,
        (SUBSTRING_INDEX(SUBSTRING_INDEX(users.interests, ',', ui.ui_id), ',', -1) + 0) ui_id
      FROM users
        LEFT JOIN (SELECT id AS ui_id FROM interests) ui 
          ON CHAR_LENGTH(users.interests) - CHAR_LENGTH(REPLACE(users.interests, ',', '')) >= (ui.ui_id - 1)
    ) user_interests ON users.id = user_interests.user_id
    LEFT JOIN interests ON user_interests.ui_id = interests.id
    ORDER BY user_id, ui_id;
    

    Outputs:

    user_id | username | interest_ids | interest
    --------+----------+--------------+---------
    1       | fred     | 3,4,8,6,10   | fishing
    1       | fred     | 3,4,8,6,10   | sports
    1       | fred     | 3,4,8,6,10   | religion
    1       | fred     | 3,4,8,6,10   | science
    1       | fred     | 3,4,8,6,10   | philanthropy
    2       | joe      | 7,11,8,9     | art
    2       | joe      | 7,11,8,9     | science
    2       | joe      | 7,11,8,9     | politics
    2       | joe      | 7,11,8,9     | cooking
    

    As you can see...

    SELECT
      users.id user_id,
      (SUBSTRING_INDEX(SUBSTRING_INDEX(users.interests, ',', ui.ui_id), ',', -1) + 0) ui_id
    FROM users
      LEFT JOIN (SELECT id AS ui_id FROM interests) ui 
        ON CHAR_LENGTH(users.interests) - CHAR_LENGTH(REPLACE(users.interests, ',', '')) >= (ui.ui_id - 1)
    

    ...builds and populates the temporary table user_interests with the users.interests field data normalized:

    user_id | ui_id
    --------+------
    1       | 3
    1       | 4
    1       | 6
    1       | 8
    1       | 10
    2       | 7
    2       | 8
    2       | 9
    2       | 11
    

    ...which is then LEFT JOIN'ed between the users and interests tables.


    Try it here: https://onecompiler.com/mysql/3yfhmgq3y

    -- create
    CREATE TABLE users (
      id INT PRIMARY KEY,
      username VARCHAR(20),
      interests VARCHAR(20)
    );
    
    CREATE TABLE interests (
      id INT PRIMARY KEY,
      interest VARCHAR(20)
    );
    
    -- insert
    INSERT INTO users VALUES (1, 'fred', '3,4,8,6,10'), (2, 'joe', '7,11,8,9');
    INSERT INTO interests VALUES (1, 'business'), (2, 'farming'), (3, 'fishing'), (4, 'sports'), (5, 'technology'), (6, 'religion'), (7, 'art'), (8, 'science'), (9, 'politics'), (10, 'philanthropy'), (11, 'cooking');
    
    -- select
    SELECT users.id user_id, username, interests, interests.interest
    FROM users
    LEFT JOIN (
      SELECT
        users.id user_id,
        (SUBSTRING_INDEX(SUBSTRING_INDEX(users.interests, ',', ui.ui_id), ',', -1) + 0) ui_id
      FROM users
        LEFT JOIN (SELECT id AS ui_id FROM interests) ui 
          ON CHAR_LENGTH(users.interests) - CHAR_LENGTH(REPLACE(users.interests, ',', '')) >= (ui.ui_id - 1)
    ) user_interests ON users.id = user_interests.user_id
    LEFT JOIN interests ON user_interests.ui_id = interests.id
    ORDER BY user_id, ui_id;
    

    Inspired by Leon Straathof's and fthiella's answers to this SO question.