Search code examples
sqldatabasepostgresqlwhere-clausesql-delete

How to limit number of rows in a database for each owner id


I need to create a complex SQL statement for Postgres that I can't seem to figure out, would anyone be able to help me out here? Here's what I need:

I have a table called tags, which is a table that stores a name and value, along with information such as the user id that owns the tag and the id of the tag itself. For the context of this problem, here are the only columns you need to know:

  • id: Id of the tag itself (BIGSERIAL)
  • guild_id: The id (BIGINT) of the guild the tag is in. A guild is essentially a large group of users in one giant room, and each tag is only associated with one group.
  • owner_id: The id (BIGINT) of the user that owns the tag.
  • created_at: The UTC timestamp (TIMESTAMP) of the time the tag was created.

I need to allow a moderator to delete all excess tags to a specific limit from each user in a guild, prioritizing older tags first. For example, let's say this is my table:

 id |      guild_id      |      owner_id      |         created_at
----+--------------------+--------------------+----------------------------
 14 | 715039810976219206 | 322896727071784960 | 2022-03-30 21:47:36.81417
 17 | 715039810976219206 | 600855080962490396 | 2022-03-31 00:57:18.024558
 20 | 715039810976219206 | 322896727071784960 | 2022-03-31 01:03:57.586239
 21 | 715039810976219206 | 322896727071784960 | 2022-03-31 01:35:13.426953
 22 | 715039810976219206 | 322896727071784960 | 2022-04-14 02:02:46.852122
 25 | 715039810976219206 | 322896727071784960 | 2022-04-14 02:03:13.559069
 26 | 239865610974543234 | 322896727071784960 | 2022-04-14 02:03:18.885876

When a moderator tries to do this function with a limit of 3 and a guild of 715039810976219206, this is what it will do:

  • For each user in the owner_id column, it will determine whether or not they own 3 or more tags within a specific guild. In this example, user 322896727071784960 owns more than 3 tags in guild 715039810976219206.
  • It will then proceed to delete all tags until the tags until there are only 3 left. It is important that it prioritizes the deletion of older tags first. In this case, it'll delete tags 14 and 20. This will result in only 3 tags left for user 322896727071784960 in guild 715039810976219206.

I've already attempted this for a while, this is as far that I've gotten if it helps:

DELETE FROM tags
WHERE id IN (
    SELECT id FROM tags
    WHERE guild_id = $1 AND owner_id = ???? ORDER BY created_at DESC
    OFFSET $3
);

(The ???? is there because I completely lost what I was doing)

Hopefully, I've explained it clearly enough for you to understand. Any help would be appreciated, and an explanation of the answer would be helpful, if possible.


Solution

  • If I understand your question correctly, you could use a rank() function and group them by the owner_id, and order by the partition date. then delete all the records that have a rank higher than 3.

    DELETE FROM  tags WHERE id in (
      SELECT id FROM 
      (
        SELECT id, 
               owner_id,
               rank() over (partition by owner_id order by created_at desc) as rnk
        FROM tags
      ) as t
      WHERE t.rnk > 3
     );
     
     SELECT * FROM tags
    

    db fiddle link