Search code examples
sqlsqlitedatetimegreatest-n-per-groupwindow-functions

Sqlite Query: Five Most Recent Episode of Each Podcast


My application stores hundreds of podcast episodes in a SQLite table. I am looking for query that returns the five most recent episodes for each podcast (podcasts are stored in a different table).

The relevant table columns are

  • media_id = primary key (unique identifier for an episode)
  • publication_date = can be used find the most recent episodes
  • episode_remote_podcast_feed_location = defines the relation to a podcast

Here is what I tried:

SELECT *
FROM episodes a
WHERE a.media_id IN (
    SELECT b.media_id
    FROM episodes b 
    WHERE a.episode_remote_podcast_feed_location = b.episode_remote_podcast_feed_location
    ORDER BY b.episode_remote_podcast_feed_location, b.publication_date DESC
    LIMIT 5
) 
ORDER BY a.episode_remote_podcast_feed_location, a.publication_date

The above query always returns five episodes (in total) - no matter how many podcasts I have. At least those episodes are all belonging to the same podcast. (Episodes have the same value for episode_remote_podcast_feed_location.)

Needed solution: I need a query that returns five episodes (the most recent ones) for each podcast. So if the application has only a single podcast stored, return five episodes. If the application has three podcasts stored, return 15 episodes: five episodes for podcast b, five episodes for podcast c.

Additional information: I need to able to use the query in an Android application. The version of SQLite bundled with Android is not the most current. Some SQLite features, for example Window Functions, are not supported on Android (see also: https://stackoverflow.com/a/55069907/14326132).


Solution

  • You can use row_number() in a subquery to enumerate the episodes of each podcast, then filter on the top 5 per podcast in the outer query:

    select *
    from (
        select e.*, 
            row_number() over(
                partition by episode_remote_podcast_feed_location 
                order by  publication_date desc
            ) rn
        from episodes e
    ) e
    where rn <= 5
    

    In versions of SQLite that do not support window functions, an alternative is a correlated subquery:

    select *
    from episodes e
    where (
        select count(*) 
        from episodes e1 
        where 
            e1.episode_remote_podcast_feed_location = e.episode_remote_podcast_feed_location 
            and e1.publication_date  >= e.publication_date 
    ) <= 5