Search code examples
sqldatabaseruby-on-rails-3duplicatespostgresql-9.2

Find and delete duplicate rows with PostgreSQL


We have a table of photos with the following columns:

id, merchant_id, url 

this table contains duplicate values for the combination merchant_id, url. so it's possible that one row appears more several times.

234 some_merchant  http://www.some-image-url.com/abscde1213
235 some_merchant  http://www.some-image-url.com/abscde1213
236 some_merchant  http://www.some-image-url.com/abscde1213

What is the best way to delete those duplications? (I use PostgreSQL 9.2 and Rails 3.)


Solution

  • Here is my take on it.

    select * from (
      SELECT id,
      ROW_NUMBER() OVER(PARTITION BY merchant_Id, url ORDER BY id asc) AS Row
      FROM Photos
    ) dups
    where 
    dups.Row > 1
    

    Feel free to play with the order by to tailor the records you want to delete to your specification.

    SQL Fiddle => http://sqlfiddle.com/#!15/d6941/1/0


    SQL Fiddle for Postgres 9.2 is no longer supported; updating SQL Fiddle to postgres 9.3