Search code examples
sqlsnowflake-cloud-data-platformansi-sql

how to remove duplicates in the table based on one column in SQL


I do have a data like

employee(id, transaction, sales)

id                  transaction             sales
1100000000002493644 46414104026581005688    2.2400
1100000000003120288 46403360812820035967    2.2400
1100000000002493644 46414104026581005688    2.2400
3300000000002592894 46427351011280903808    2.2400
1100000000108772072 46498465013410865589    3.9800
1100000000002493644 46498465013410865909    3.2900

As you can the above table has 3 columns. I'm trying to remove duplicates based on transaction data. I tried the below code, but it's not working

SELECT employee.*, DISTINCT(transaction) AS UNIQUE_TXN
FROM employee AS employee
INNER employee AS employee_1
    ON employee.transaction= employee_1.transaction
GROUP BY employee.*

But it's giving the wrong output. The expected output is below

id                  transaction             sales
1100000000002493644 46414104026581005688    2.2400
1100000000003120288 46403360812820035967    2.2400
3300000000002592894 46427351011280903808    2.2400
1100000000108772072 46498465013410865589    3.9800
1100000000002493644 46498465013410865909    3.2900

Can anyone help me with this?


Solution

  • To find duplicates the following code could be used:

    SELECT *, COUNT(*) OVER(PARTITION BY transaction) AS num_per_transaction
    FROM employee
    QUALIFY num_per_transaction > 1;