Search code examples
sqlitegreatest-n-per-group

How to Average the most recent X entries with GROUP BY


I've looked at many answers on SO concerning situations related to this but I must not be understanding them too well as I didn't manage to get anything to work.

I have a table with the following columns: timestamp (PK), type (STRING), val (INT)

I need to get the most recent 20 entries from each type and average the val column. I also need the COUNT() as there may be fewer than 20 rows for some of the types.

I can do the following if I want to get the average of ALL rows for each type:

SELECT type, COUNT(success), AVG(success)
  FROM user_data
  GROUP BY type

But I want to limit each group COUNT() to 20.

From here I tried the following:

SELECT type, (
  SELECT AVG(success) AS ave
  FROM (
    SELECT success
      FROM user_data AS ud2
      WHERE umd2.timestamp = umd.timestamp
      ORDER BY umd2.timestamp DESC
      LIMIT 20
    )
  ) AS ave
  FROM user_data AS ud
  GROUP BY type

But the returned average is not correct. The values it returns are as if the statement is only returning the average of a single row for each group (it doesn't change regardless of the LIMIT).


Solution

  • Using sqlite, you may consider the row_number function in a subquery to acquire/filter the most recent entries before determining the average and count.

    SELECT
        type,
        AVG(val),
        COUNT(1)
    FROM (
        SELECT 
            *,
            ROW_NUMBER() OVER (
               PARTITION BY type
               ORDER BY timestamp DESC
            ) rn
        FROM
            user_data
    ) t
    WHERE rn <=20
    GROUP BY type