Search code examples
sqlmysqlnode.jsrefactoring

Refactor code to avoid multiple SQL queries


I'm creating a simple e-commerce database in Node.js (with express) and MySQL (mysql2 package).

To get a single product, I need to fetch data from 3 different tables: products, product_images (with foreign key to product id) and reviews (also with foreign key to product id).

My 3 queries :

const productQuery = `
    SELECT p.name, p.reference AS ref, p.price, p.brand, p.stock_qty, p.product_subcategory_id AS subcategoryId, p.product_promotion_id AS promotionId
    FROM products p
    WHERE id = ?`;

const imagesQuery = `
    SELECT url
    FROM product_images
    WHERE product_id = ?`

const reviewsQuery = `
  SELECT r.content AS reviewContent, r.rate AS reviewRate, r.created_at AS reviewDate, r.reviews_user_id AS reviewUserId
  FROM reviews r
  WHERE reviews_product_id = ?
  ORDER BY reviewDate DESC`;

const [product] = await sql.query(productQuery, id);
const [images] = await sql.query(imagesQuery, id);
const [reviews] = await sql.query(reviewsQuery, id);

return { product, images, reviews };

The result (there is no image in my seed yet, so the array being empty is normal):

{
    "product": [
        {
            "name": "FAMILY CARE TRIPLE",
            "ref": "2c85da01",
            "price": "246.21",
            "brand": "Wordpedia",
            "stock_qty": "61",
            "subcategoryId": 21,
            "promotionId": null
        }
    ],
    "images": [],
    "reviews": [
        {
            "reviewContent": "Fusce posuere felis sed lacus. Morbi sem mauris, laoreet ut, rhoncus aliquet, pulvinar sed, nisl. Nunc rhoncus dui vel sem.",
            "reviewRate": 5,
            "reviewDate": "2023-03-04T11:12:13.000Z",
            "reviewUserId": 1
        },
        {
            "reviewContent": "Proin eu mi. Nulla ac enim. In tempor, turpis nec euismod scelerisque, quam turpis adipiscing lorem, vitae mattis nibh ligula nec sem.",
            "reviewRate": 4,
            "reviewDate": "2022-08-28T04:39:37.000Z",
            "reviewUserId": 1
        },
        {
            "reviewContent": "Praesent blandit. Nam nulla. Integer pede justo, lacinia eget, tincidunt eget, tempus vel, pede.",
            "reviewRate": 4,
            "reviewDate": "2022-08-19T17:05:23.000Z",
            "reviewUserId": 1
        },
        {
            "reviewContent": "Praesent id massa id nisl venenatis lacinia. Aenean sit amet justo. Morbi ut odio.",
            "reviewRate": 1,
            "reviewDate": "2022-08-19T16:05:44.000Z",
            "reviewUserId": 1
        },
        {
            "reviewContent": "Aenean lectus. Pellentesque eget nunc. Donec quis orci eget orci vehicula condimentum.",
            "reviewRate": 1,
            "reviewDate": "2022-06-18T00:50:21.000Z",
            "reviewUserId": 2
        }
    ]
}

The result is the one I want, but I don't think making 3 queries to the database is optimize.

At first I tried this query (just to get product info and related reviews, but not the images)

SELECT 
    p.name, p.reference AS ref, p.price, p.brand, p.stock_qty, 
    p.product_subcategory_id AS subcategoryId, 
    p.product_promotion_id AS promotionId, r.content AS reviewContent, 
    r.rate AS reviewRate, r.created_at AS reviewDate, 
    r.reviews_user_id AS reviewUserId
FROM 
    products p
JOIN
    reviews r ON p.id = r.reviews_product_id
WHERE 
    p.id = ?

The result :

[
    {
        "name": "FAMILY CARE TRIPLE",
        "ref": "2c85da01",
        "price": "246.21",
        "brand": "Wordpedia",
        "stock_qty": "61",
        "subcategoryId": 21,
        "promotionId": null,
        "reviewContent": "Proin eu mi. Nulla ac enim. In tempor, turpis nec euismod scelerisque, quam turpis adipiscing lorem, vitae mattis nibh ligula nec sem.",
        "reviewRate": 4,
        "reviewDate": "2022-08-28T04:39:37.000Z",
        "reviewUserId": 1
    },
    {
        "name": "FAMILY CARE TRIPLE",
        "ref": "2c85da01",
        "price": "246.21",
        "brand": "Wordpedia",
        "stock_qty": "61",
        "subcategoryId": 21,
        "promotionId": null,
        "reviewContent": "Aenean lectus. Pellentesque eget nunc. Donec quis orci eget orci vehicula condimentum.",
        "reviewRate": 1,
        "reviewDate": "2022-06-18T00:50:21.000Z",
        "reviewUserId": 2
    },
    {
        "name": "FAMILY CARE TRIPLE",
        "ref": "2c85da01",
        "price": "246.21",
        "brand": "Wordpedia",
        "stock_qty": "61",
        "subcategoryId": 21,
        "promotionId": null,
        "reviewContent": "Fusce posuere felis sed lacus. Morbi sem mauris, laoreet ut, rhoncus aliquet, pulvinar sed, nisl. Nunc rhoncus dui vel sem.",
        "reviewRate": 5,
        "reviewDate": "2023-03-04T11:12:13.000Z",
        "reviewUserId": 1
    },
    {
        "name": "FAMILY CARE TRIPLE",
        "ref": "2c85da01",
        "price": "246.21",
        "brand": "Wordpedia",
        "stock_qty": "61",
        "subcategoryId": 21,
        "promotionId": null,
        "reviewContent": "Praesent blandit. Nam nulla. Integer pede justo, lacinia eget, tincidunt eget, tempus vel, pede.",
        "reviewRate": 4,
        "reviewDate": "2022-08-19T17:05:23.000Z",
        "reviewUserId": 1
    },
    {
        "name": "FAMILY CARE TRIPLE",
        "ref": "2c85da01",
        "price": "246.21",
        "brand": "Wordpedia",
        "stock_qty": "61",
        "subcategoryId": 21,
        "promotionId": null,
        "reviewContent": "Praesent id massa id nisl venenatis lacinia. Aenean sit amet justo. Morbi ut odio.",
        "reviewRate": 1,
        "reviewDate": "2022-08-19T16:05:44.000Z",
        "reviewUserId": 1
    }
]

My probleme is that, even if the result is technically correct (I got all the data I asked for), it is harder to work with it (compare to my current code), and I guess it is not very efficient since it has some data multiple times (the same data about the product is repeated for every different review).

So my question: is there a way to get a result as my actual one, but with a single query?

And if not should I stay with my actual code, or should I use my first one?


Solution

  • I would continue to use three queries. If you could do this with a single query, the query would be very complicated and difficult to maintain. Also you would have to write a lot of code after fetching the result to separate the results into the structure you want.

    In a comment above, Shadow referred to a SQL query that can produce JSON-structured results. This would look something like this (for example purposes, not intended as a solution to your case):

    SELECT JSON_OBJECT(
      'product', p.products,
      'images', i.images,
      'reviews', r.reviews
    ) AS _result
    FROM (
      SELECT JSON_ARRAYAGG(
        JSON_OBJECT(
          'name', name,
          'ref', reference,
          'price', price,
          'brand', brand,
          'stock_qty', stock_qty,
          'subcategoryId', product_subcategory_id,
          'promotionId', 'product_promotion_id
        )
      ) AS products
      FROM products
      WHERE id = ?
    ) AS p
    CROSS JOIN (
      ...similar for product_images table...
    ) AS i
    CROSS JOIN (
      ...similar for reviews table...
    ) AS r;
    

    This is more difficult to write, more difficult to explain to another developer, and it's hard to add new elements if you need to do that in the future.

    Sometimes the simpler approach is better for the lifetime of your code.

    It makes me recall this classic quote from Brian Kernighan:

    “Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?”

    Another comment: you said you guessed that three queries was not efficient. But this vague guess isn't a sufficient reason to try to make it into a single, more complicated query. You should have a more specific reason why the original implementation is not good enough, which you can support with some performance measurements.