Search code examples
mysqljoingroup-concat

group_concat with multiple joins in MySQL


Database schema

create table `questions` (
  `id` int not null auto_increment,
  `title` varchar(45) not null,
  primary key (`id`));

create table `tags` (
  `id` int not null auto_increment,
  `question_id` int not null,
  `name` varchar(45) not null,
  primary key (`id`));

create table `comments` (
  `id` int not null auto_increment,
  `question_id` int not null,
  `body` varchar(45) not null,
  primary key (`id`));

insert into questions (title) values
("title1"), ("title2"), ("title3");

insert into tags (question_id, name) values
(1, "javascript"), (1, "php"), (1, "c#"), (2, "mysql"), (2, "php"), (3, "c#");

insert into comments (question_id, body) values
(1, "comment1"), (1, "comment1"), (1, "comment2"), (3, "comment3");

That's how it looks visually:

questions table

| id |  title |
|----|--------|
|  1 | title1 |
|  2 | title2 |
|  3 | title3 |

tags table

| id | question_id |       name |
|----|-------------|------------|
|  1 |           1 | javascript |
|  2 |           1 |        php |
|  3 |           1 |         c# |
|  4 |           2 |      mysql |
|  5 |           2 |        php |
|  6 |           3 |         c# |

comments table

| id | question_id |     body |
|----|-------------|----------|
|  1 |           1 | comment1 |
|  2 |           1 | comment1 |
|  3 |           1 | comment2 |
|  4 |           3 | comment3 |

Each question has to have at least one tag. It can also have 0 or more comments. There can be two comments with the same body on one question.

Desired output

I want to select all questions, that is, their ids, titles, tags and comments.

The output should look like that:

| id |  title |       tags        |          comments          |
|----|--------|-------------------|----------------------------|
|  1 | title1 | c#,php,javascript | comment1,comment1,comment2 |
|  2 | title2 | php,mysql         | (null)                     |
|  3 | title3 | c#                | comment3                   |

Attempts to solve the problem

I tried the following query:

select questions.id, questions.title,
  group_concat(tags.name), group_concat(comments.body)
from questions
join tags on questions.id = tags.question_id
left join comments on questions.id = comments.question_id
group by questions.id

Unfortunately, it doesn't work as expected. It produces the following output:

| id |  title | group_concat(distinct tags.name) |                                                      group_concat(comments.body) |
|----|--------|----------------------------------|----------------------------------------------------------------------------------|
|  1 | title1 |                c#,php,javascript | comment1,comment1,comment1,comment2,comment2,comment2,comment1,comment1,comment1 |
|  2 | title2 |                        php,mysql |                                                                           (null) |
|  3 | title3 |                               c# |                                                                         comment3 |

As you see, for the first question I get each comment three times, because there are three tags on this question.

Also, the comments are in the wrong order. They should be in the same order as they were inserted, that is, comment1,comment1,comment2, not comment1,comment2,comment1.

I can't use distinct for comments, as there can be multiple comments with the same body on one question.

I know that this probably could be solved with nested selects, but as far as I know it would have a huge negative impact on the performance of the query.

SQL Fiddle

The SQL Fiddle with database schema and my query.


Solution

  • You need to first aggregate and apply GROUP_CONCAT and then join:

    select questions.id, questions.title,
           tags.name, comments.body
    from questions
    join (
       select question_id, group_concat(tags.name) as name
       from tags
       group by question_id
    ) tags on questions.id = tags.question_id
    left join (
       select question_id, group_concat(comments.body) as body
       from comments
       group by question_id
    ) comments on questions.id = comments.question_id