Search code examples
filtergroup-byduplicatessnowflake-cloud-data-platformhaving

How to select rows from a table based on duplicated values in a column Snowflake


I have a table A that looks similar to:

ID PET COUNTRY
45 DOG US
72 DOG CA
15 CAT CA
36 CAT US
37 CAT SG
12 SNAKE IN
20 PIG US
14 PIG RS
33 HORSE IQ

(has about a few hundred rows)

I would like to retain the rows that have a duplicated "PET" value, so the result looks like:

|ID|PET |COUNTRY
|--| --- |---|
|45| DOG |US|
|72 |DOG|CA|
|15 |CAT |CA|
|36 |CAT|US|
|37 |CAT|SG|
|20|PIG|US|
|14|PIG|RS|

How can I remove the rows that do not have duplicated PET values? Would it be something like

SELECT ID, PET, COUNTRY, COUNT(*)
FROM A 
GROUP BY PET, COUNTRY, ID
HAVING COUNT(*) >1

I am not sure how to group the values by PET and pick out the groups only containing one row. Thanks!


Solution

  • What about simply doing:

    WITH 
    RES AS (SELECT PET, COUNT(*) FROM A GROUP BY PET HAVING COUNT(*) > 1)
    SELECT ID, PET, COUNTRY FROM A WHERE PET IN (SELECT PET FROM RES);
    

    This would give you all rows with pets present in more than one row.