SQL select by column of newest element of has_many relationship


In my project I have a tasks(id, name) table. I also have a status_updates(id, task_id, user_id, status, created_at) table.

I want to select tasks whose newest status_update equals a certain status.

I cannot figure out how to combine the following things, however:

  1. Only use the newest status_update for each task.
  2. Filter away the status_updates that do not have the correct status.
  3. Wrap this in an INNER JOIN so the final result is a collection of tasks.

When using GROUP BY, requirements 1 and 2 seem to cancel each other out. An example:

SELECT * FROM status_updates GROUP BY task_id ORDER BY created_at DESC

gives back only the newest status_update for each task. However, this:

SELECT * FROM status_updates WHERE status='1' GROUP BY task_id ORDER BY created_at DESC

applies the WHERE before the grouping and ordering, so it returns the newest update among those with the correct status, instead of keeping only the tasks whose newest update has the correct status.
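
The effect can be reproduced on a tiny dataset. The following is a minimal sketch using Python's built-in sqlite3 module; the task names, user ids and timestamps are made up for illustration, and MAX(created_at) stands in for the ORDER BY so that the result is deterministic. Task 1's newest update has status '2', yet filtering before grouping still returns task 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE status_updates (
        id INTEGER PRIMARY KEY, task_id INTEGER, user_id INTEGER,
        status TEXT, created_at TEXT
    );
    -- Hypothetical sample rows, not taken from the question.
    INSERT INTO tasks VALUES (1, 'write report'), (2, 'send invoice');
    -- Task 1: status '1' first, then '2' (newest). Task 2: newest is '1'.
    INSERT INTO status_updates VALUES
        (1, 1, 10, '1', '2024-01-01 09:00:00'),
        (2, 1, 10, '2', '2024-01-02 09:00:00'),
        (3, 2, 11, '2', '2024-01-01 10:00:00'),
        (4, 2, 11, '1', '2024-01-03 10:00:00');
""")

# WHERE runs before GROUP BY, so this finds the newest update among the
# rows with status '1', not the tasks whose newest update has status '1'.
filtered_first = conn.execute("""
    SELECT task_id, MAX(created_at)
    FROM status_updates
    WHERE status = '1'
    GROUP BY task_id
    ORDER BY task_id
""").fetchall()

print(filtered_first)
```

Both tasks come back, even though only task 2 currently has status '1'.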

I also tried using HAVING, but as it only works on aggregate columns I could not figure out how to use it properly, or if it was at all helpful in this scenario.

I am using Ruby on Rails, so I'd like an answer that works on both MySQL and SQLite.


Solution

  • Using a subquery:

    SELECT
        T.id,
        T.name,
        SU.status
    FROM
        Tasks T
    INNER JOIN (SELECT task_id, MAX(created_at) AS max_created_at FROM Status_Updates GROUP BY task_id) SQ ON SQ.task_id = T.id
    INNER JOIN Status_Updates SU ON
        SU.task_id = SQ.task_id AND
        SU.created_at = SQ.max_created_at
    WHERE
        SU.status = '1'
    

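  • Using window functions and a CTE. Older releases of MySQL and SQLite do not support these, but MySQL 8.0+ and SQLite 3.25+ do, as do many other RDBMSs. This one has the advantage that it handles ties better: when two status updates for a task share the same created_at, the id tiebreaker picks exactly one of them, whereas the subquery version can match both: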

    WITH MyCTE AS
    (
        SELECT
            T.id,
            T.name,
            SU.status,
            ROW_NUMBER() OVER (PARTITION BY T.id ORDER BY SU.created_at DESC, SU.id DESC) AS row_num
        FROM
            Tasks T
        INNER JOIN Status_Updates SU ON SU.task_id = T.id
    )
    SELECT
        id,
        name,
        status
    FROM
        MyCTE
    WHERE
        row_num = 1 AND
        status = '1'
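
To compare the two approaches side by side, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are invented for illustration, the table names are lowercased to match the question's schema, and the window-function query filters on status in its outer WHERE. Task 3 has two updates with an identical created_at, which is where the approaches differ. Window functions need SQLite 3.25 or newer, so that query is skipped on older builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE status_updates (
        id INTEGER PRIMARY KEY, task_id INTEGER, user_id INTEGER,
        status TEXT, created_at TEXT
    );
    -- Hypothetical sample rows, not taken from the question.
    INSERT INTO tasks VALUES
        (1, 'write report'), (2, 'send invoice'), (3, 'book venue');
    INSERT INTO status_updates VALUES
        (1, 1, 10, '1', '2024-01-01 09:00:00'),
        (2, 1, 10, '2', '2024-01-02 09:00:00'),  -- newest for task 1
        (3, 2, 11, '2', '2024-01-01 10:00:00'),
        (4, 2, 11, '1', '2024-01-03 10:00:00'),  -- newest for task 2
        (5, 3, 12, '1', '2024-01-04 08:00:00'),  -- tie with row 6
        (6, 3, 12, '2', '2024-01-04 08:00:00');  -- tie, higher id
""")

SUBQUERY_SQL = """
    SELECT T.id, T.name, SU.status
    FROM tasks T
    INNER JOIN (
        SELECT task_id, MAX(created_at) AS max_created_at
        FROM status_updates
        GROUP BY task_id
    ) SQ ON SQ.task_id = T.id
    INNER JOIN status_updates SU
        ON SU.task_id = SQ.task_id AND SU.created_at = SQ.max_created_at
    WHERE SU.status = '1'
    ORDER BY T.id
"""

WINDOW_SQL = """
    WITH latest AS (
        SELECT T.id, T.name, SU.status,
               ROW_NUMBER() OVER (
                   PARTITION BY T.id
                   ORDER BY SU.created_at DESC, SU.id DESC
               ) AS row_num
        FROM tasks T
        INNER JOIN status_updates SU ON SU.task_id = T.id
    )
    SELECT id, name, status
    FROM latest
    WHERE row_num = 1 AND status = '1'
    ORDER BY id
"""

subquery_rows = conn.execute(SUBQUERY_SQL).fetchall()

# Window functions were added in SQLite 3.25; skip the query on older builds.
if sqlite3.sqlite_version_info >= (3, 25, 0):
    window_rows = conn.execute(WINDOW_SQL).fetchall()
else:
    window_rows = None

print("subquery:", subquery_rows)
print("window:  ", window_rows)
```

With this data the subquery version returns tasks 2 and 3, because both tied rows of task 3 match the maximum created_at and one of them has status '1'. The window version returns only task 2, since the id tiebreaker treats row 6 (status '2') as the newest update of task 3.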