Search code examples
mysqlsqlsubquerylimitsql-delete

MySQL delete statement based on sub-select with multiple return values while retaining 500 records for each unique value


MySQL delete statement based on sub-select with multiple return values.

Here is what I am doing now:

DELETE FROM `dnsstats` WHERE id NOT IN
(
 SELECT id FROM
    (
    SELECT id FROM `dnsstats` WHERE peerhost = 'x.x.x.243' ORDER BY id DESC LIMIT 500
    ) foo 
)
 AND id NOT IN
(
 SELECT id FROM
    (
    SELECT id FROM `dnsstats` WHERE peerhost = 'x.x.x.40' ORDER BY id DESC LIMIT 500
  ) foo2
)
 AND id NOT IN
(
 SELECT id FROM
    (
    SELECT id FROM `dnsstats` WHERE peerhost = 'x.x.x.50' ORDER BY id DESC LIMIT 500
  ) foo3
);

This works great, but I would like to select the individual IP addresses dynamically like so:

SELECT peerhost FROM `dnsstats` GROUP BY peerhost;

..and delete based on those returned values.

This is what I got from my previous question:

DELETE FROM `dnsstats` WHERE id NOT IN
(
 SELECT id FROM
    (
    SELECT id FROM `dnsstats` WHERE peerhost in (
            SELECT peerhost FROM `dnsstats` GROUP BY peerhost
        ) ORDER BY id DESC LIMIT 500
    ) foo 
);

..but this leaves me with 500 total records. I would like to keep 500 records of each peerhost

Any idea what I can do to make this work? Thanks!

Original question: MySQL delete statement based on sub-select with multiple return values


Solution

  • I would do this using variables. The following enumerates each peerhost:

    select s.*,
           (@rn := if(@ph = peerhost, @rn + 1,
                      if(@ph := peerhost, 1, 1)
                     )
           ) as rn
    from dnstats s cross join
         (select @ph := '', @rn := 0) params
    order by peerhost, id desc;
    

    You can use this in a delete using a subquery:

    delete s
        from dnstats s join
             (select s.*,
                     (@rn := if(@ph = peerhost, @rn + 1,
                                if(@ph := peerhost, 1, 1)
                               )
                     ) as rn
              from dnstats s cross join
                   (select @ph := '', @rn := 0) params
              order by peerhost, id desc
             ) es
             on es.id = s.id
        where rn > 500;
    

    This deletes all but the first 500 rows for each peerhost.