Search code examples
sqlpostgresqlduplicatessql-deletepostgresql-9.5

How to delete duplicated data by column form database


I have a table that has multiple duplicated rows by one column. So I want to clear duplicated datas.

Events Table:

event_id,   event_type,     data_id,    data    date
    1       insert              1       x       06.03.2019 13:04
    2       update              1       x1      06.03.2019 13:05
    3       update              1       x11     06.03.2019 13:06
    4       insert              2       y       06.03.2019 13:07
    5       update              1       x111    06.03.2019 13:08
    6       delete              1       x111    06.03.2019 13:09        
    7       update              2       y1      06.03.2019 13:10
    8       update              2       y11     06.03.2019 13:11
    9       update              2       y11     06.03.2019 13:12

Each data id has 1 insert, N update and 1 delete event rows in table. So I want to delete N-1 update events but last event will not be deleted. For example in this table, for data_id=1 update events are 2,3,5. I want to delete 2 and 3 but not 5. Because 5 is last update.


Solution

  • With EXISTS:

    delete from tablename t
    where
      event_type = 'update'
      and exists (
        select 1 from tablename 
        where 
          data_id = t.data_id 
          and
          event_type = 'update'
          and 
          event_id > t.event_id
      )