Search code examples
mysqlsqlperformancequery-optimization

MySQL: How do I Optimize this JOIN Query?


I have 3 tables; artist, album, song_cover, and song. I would like to select an album and the total number of songs in that album. Am currently using this query, but it is logged in the mysql-slow.log file. In the PHPMyAdmin, the query speed is inconsistent. Sometimes it will execute for 0.0005 seconds and other times, 2 seconds or more.

SELECT /*+ MAX_EXECUTION_TIME(1000) */ album.*, 
artist_id, artist_aka, artist_slug, artist_profile_image, cover_filename,
(
   SELECT COUNT(*)
   FROM song
   WHERE song.song_album_id = album.album_id
) AS TotalSongs
FROM album 
LEFT JOIN artist ON album.album_artist = artist.artist_id 
LEFT JOIN song_cover ON album.album_cover_id = song_cover.cover_id 
ORDER BY album_id DESC LIMIT 0, 11

ROWS artist: 15,978, album: 14,167, song: 67,559, song_cover: 12,668

EXPLAIN enter image description here

Thank you in advance.


Solution

  • I would write it this way:

    EXPLAIN SELECT b.*,
      a.artist_id, a.artist_aka, a.artist_slug, a.artist_profile_image, 
      c.cover_filename,
      COUNT(*) AS TotalSongs
    FROM album AS b
    INNER JOIN artist AS a ON b.album_artist = a.artist_id
    LEFT OUTER JOIN song AS s ON s.song_album_id = b.album_id
    LEFT OUTER JOIN song_cover AS c ON b.album_cover_id = c.cover_id
    GROUP BY b.album_id
    ORDER BY b.album_id DESC LIMIT 0, 11;
    

    This eliminates the dependent subquery, in favor of another join and GROUP BY.

    Here's the EXPLAIN report as near as I can guess at it:

    +----+-------------+-------+------------+--------+-------------------------------------+---------------+---------+-----------------------+------+----------+---------------------+
    | id | select_type | table | partitions | type   | possible_keys                       | key           | key_len | ref                   | rows | filtered | Extra               |
    +----+-------------+-------+------------+--------+-------------------------------------+---------------+---------+-----------------------+------+----------+---------------------+
    |  1 | SIMPLE      | b     | NULL       | index  | PRIMARY,album_cover_id,album_artist | PRIMARY       | 4       | NULL                  |    1 |   100.00 | Backward index scan |
    |  1 | SIMPLE      | a     | NULL       | eq_ref | PRIMARY                             | PRIMARY       | 4       | test.b.album_artist   |    1 |   100.00 | NULL                |
    |  1 | SIMPLE      | s     | NULL       | ref    | song_album_id                       | song_album_id | 4       | test.b.album_id       |    1 |   100.00 | Using index         |
    |  1 | SIMPLE      | c     | NULL       | eq_ref | PRIMARY                             | PRIMARY       | 4       | test.b.album_cover_id |    1 |   100.00 | NULL                |
    +----+-------------+-------+------------+--------+-------------------------------------+---------------+---------+-----------------------+------+----------+---------------------+
    

    I have no data in my tables, so the row counts are trivial.

    There's still a problem that it's doing an index-scan of album, which in your case is 14,167 rows. That could be costly.

    But the other joins are all using indexes. Two of them are type: eq_ref, indicating that it's joining to the primary key of those tables.

    I changed the join to artist to an inner join. I can't see how an album could not reference an artist. But I suppose it's possible for an album to have no songs, hence the outer join.

    I find it strange that you join album directly to song_cover. Wouldn't song_cover also need to reference the original song it's a cover of?