Search code examples
sqlperformancepostgresqlwindow-functionspostgresql-performance

SQL Window Functions - SELECT DISTINCT ORDER BY LIMIT


I have these 3 tables in my PostgreSQL database:

  • artist: id, name
  • album: id, title, year, artist_id
  • song: id, title, album_id

Basically each artist has multiple albums, and each album has multiple songs.

My query correctly returns 25 distinct artist.id's who have a song whose title begins with "The " ordered by the year of the album:

SELECT id
FROM (

    -- Remove the duplicate artists
    SELECT DISTINCT ON (a.id) id, row
    FROM (

        -- Get all matching artists 
        -- This is slow because there is no limit
        SELECT
            artist.id,
            row_number() OVER(ORDER BY album.year DESC) as row
        FROM artist
        LEFT JOIN album ON album.artist_id = artist.id
        LEFT JOIN song ON song.album_id = album.id
        WHERE song.title ilike 'The %'
        ORDER BY album.year DESC

    ) as a

) as b
ORDER BY row
LIMIT 25

However it is slow and inefficient because the innermost query has no LIMIT so it will search the entire table for all possible matches. Ideally it would stop searching when the 25 distinct artist.id's have been found.

Can this query be rewritten or optimized to execute faster?

I think window functions could speed things up here, but I haven't been able to figure out something that works.

Thanks!


Solution

  • Try this, should work faster than your current query

    SELECT
      artist.id,                        
      MAX( album.year ) as latest_album_date
    FROM 
      artist
      JOIN album ON album.artist_id = artist.id  -- JOIN, not LEFT JOIN
      JOIN song ON song.album_id = album.id      -- Since you have song.title in WHERE clause, it makes no sense to make these as a LEFT JOIN
    WHERE 
      song.title ilike 'The %'
    GROUP BY
      artist.id
    ORDER BY
      latest_album_date DESC
    limit 25;
    

    SQLFIDDLE