Tags: mysql, sql, inner-join

MySQL: query with two many-to-many relations and duplicates, with full data from subqueries


This question extends one I just asked, which was solved here: https://stackoverflow.com/posts/62861992/edit

A new question is needed, though, because one detail was missing from the other one. But first things first.

I have three models: articles, authors and tags. Each article can have many authors and many tags.

So my DB will have the following tables:

article
article_author
author
article_tag
tag

Here is the schema in MySQL:

DROP TABLE IF EXISTS article_tag;
DROP TABLE IF EXISTS article_author;
DROP TABLE IF EXISTS author;
DROP TABLE IF EXISTS tag;
DROP TABLE IF EXISTS article;

CREATE TABLE IF NOT EXISTS author (
  id INT(11) NOT NULL AUTO_INCREMENT,
  name VARCHAR(255),
  PRIMARY KEY (id)
);

CREATE TABLE IF NOT EXISTS article (
  id INT(11) NOT NULL AUTO_INCREMENT,
  title VARCHAR(255),
  PRIMARY KEY (id)
);

CREATE TABLE IF NOT EXISTS tag (
  id INT(11) NOT NULL AUTO_INCREMENT,
  tag VARCHAR(255),
  PRIMARY KEY (id)
);

CREATE TABLE IF NOT EXISTS article_author (
  article_id INT(11) NOT NULL,
  author_id INT(11) NOT NULL,
  PRIMARY KEY (article_id, author_id),
  INDEX fk_article_author_article_idx (article_id ASC) VISIBLE,
  INDEX fk_article_author_author_idx (author_id ASC) VISIBLE,
  CONSTRAINT fk_article_author_article
    FOREIGN KEY (article_id)
    REFERENCES article (id),
  CONSTRAINT fk_article_author_author
    FOREIGN KEY (author_id)
    REFERENCES author (id)
);

CREATE TABLE IF NOT EXISTS article_tag (
  article_id INT(11) NOT NULL,
  tag_id INT(11) NOT NULL,
  PRIMARY KEY (article_id, tag_id),
  INDEX fk_article_tag_article_idx (article_id ASC) VISIBLE,
  INDEX fk_article_tag_tag_idx (tag_id ASC) VISIBLE,
  CONSTRAINT fk_article_tag_article
    FOREIGN KEY (article_id)
    REFERENCES article (id),
  CONSTRAINT fk_article_tag_tag
    FOREIGN KEY (tag_id)
    REFERENCES tag (id)
);

And we can insert some data in our DB:

INSERT INTO article (id, title) VALUES (1, 'first article'), (2, 'second article'), (3, 'third article');
INSERT INTO author (id, name) VALUES (1, 'first author'), (2, 'second author'), (3, 'third author'), (4, 'fourth author');
INSERT INTO tag (id, tag) VALUES (1, 'first tag'), (2, 'second tag'), (3, 'third tag'), (4, 'fourth tag'), (5, 'fifth tag');
INSERT INTO article_tag (article_id, tag_id) VALUES (1, 1), (1, 2), (1, 3), (2, 2), (2, 4), (2, 5), (3, 1), (3, 2);
INSERT INTO article_author (article_id, author_id) VALUES (1, 1), (1, 2), (1, 3), (2, 2), (2, 4), (3, 1), (3, 2), (3, 3), (3, 4);

Now, in the previous question I was retrieving the articles, and for every article I retrieved the related author ids as well as tag ids. The solution proposed is really elegant:

SELECT a.id, a.title,
  (SELECT JSON_ARRAYAGG(aa.author_id)
    FROM article_author aa
    WHERE a.id = aa.article_id
  ) as authors,
  (SELECT JSON_ARRAYAGG(art.tag_id)
    FROM article_tag art
    WHERE a.id = art.article_id
  ) as tags
FROM article a;
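
For context on the duplicates mentioned in the title: joining both junction tables to article in a single query pairs every author of an article with every tag of that article, so rows multiply. A minimal sketch against the sample data above (the alias joined_rows is only illustrative) counts the rows such a join would produce per article:

```sql
-- Count the rows produced when both many-to-many relations are joined at once.
SELECT a.id, COUNT(*) AS joined_rows
FROM article a
JOIN article_author aa ON aa.article_id = a.id
JOIN article_tag art ON art.article_id = a.id
GROUP BY a.id
ORDER BY a.id;
-- With the sample data: article 1 -> 9 (3 authors x 3 tags),
-- article 2 -> 6 (2 x 3), article 3 -> 8 (4 x 2).
```

The correlated subqueries avoid this because each one aggregates its own relation independently of the other.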

But now I want to retrieve the author ids together with their names, and likewise for the tags: the tag id and the tag itself.

Apparently this is not a trivial question, but I would like to know what my options are to achieve that.


Solution

  • You can get the data you want using JOINs in the subquery. If you want just the names, you can use:

    SELECT a.id, a.title,
           (SELECT JSON_ARRAYAGG(au.name)
            FROM article_author aa JOIN
                 author au
                 ON au.id = aa.author_id
            WHERE a.id = aa.article_id
           ) as authors,
           (SELECT JSON_ARRAYAGG(t.tag)
            FROM article_tag art JOIN
                 tag t
                 ON art.tag_id = t.id
            WHERE a.id = art.article_id
           ) as tags
    FROM article a;
    

    I'm not sure what data structure you want with both the ids and the names. If you want an array of JSON objects with two fields in each object:

    SELECT a.id, a.title,
           (SELECT JSON_ARRAYAGG(JSON_OBJECT('name', au.name, 'id', au.id))
            FROM article_author aa JOIN
                 author au
                 ON au.id = aa.author_id
            WHERE a.id = aa.article_id
           ) as authors,
           (SELECT JSON_ARRAYAGG(JSON_OBJECT('tag', t.tag, 'id', t.id))
            FROM article_tag art JOIN
                 tag t
                 ON art.tag_id = t.id
            WHERE a.id = art.article_id
           ) as tags
    FROM article a;
    

    Here is a db<>fiddle for this version.
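
    One detail worth checking with either query: JSON_ARRAYAGG returns NULL rather than an empty array when the subquery matches no rows, so an article without authors or tags would show NULL in that column. Every article in the sample data has both, so the results above are unaffected. A minimal sketch, assuming an empty array is preferred in that case, wraps each subquery in COALESCE:

```sql
SELECT a.id, a.title,
       COALESCE(
         (SELECT JSON_ARRAYAGG(JSON_OBJECT('name', au.name, 'id', au.id))
          FROM article_author aa JOIN
               author au
               ON au.id = aa.author_id
          WHERE a.id = aa.article_id
         ),
         JSON_ARRAY()  -- assumption: [] is wanted instead of NULL
       ) AS authors,
       COALESCE(
         (SELECT JSON_ARRAYAGG(JSON_OBJECT('tag', t.tag, 'id', t.id))
          FROM article_tag art JOIN
               tag t
               ON art.tag_id = t.id
          WHERE a.id = art.article_id
         ),
         JSON_ARRAY()  -- same assumption for tags
       ) AS tags
FROM article a;
```

    Whether NULL or [] is the better result depends on how the consuming code treats missing relations, so treat this as an option rather than a required change.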