Search code examples
sqldistinctpartition

Choose which distinct value to take based on another column - PostgresSQL


I have a table of this kind:

name tag
inter cep
inter a
inter b
milan a
milan c
juve r
juve cep
naples h
naples p

And I need to take only the distinct names, in a way that if the tag = cep take that one, otherwise take a random one (maybe the first one, but not important)

so the output should be something like:

name tag
inter cep
milan a
juve cep
naples h

I tried different ways using row_number() like:

select name, tag
from (
   select *,
          row_number() over (partition by name order by name) as row_number
   from table
   ) as rows
where tag='cep' or row_number = 1

but this doesn't eliminate some of the name duplicates, especially those where "cep" is present but is not the first one in the partition (like name = juve for instance)

SOLUTION?


Solution

  • Use a case expression in the partition order by to get cep rows first:

    select name, tag
    from (
       select *,
              row_number() over (partition by name 
                                 order by case when tag = 'cep' then 0 else 1 end) as row_number
       from table
       ) as rows
    where row_number = 1
    

    Demo at https://dbfiddle.uk/6r-JDu90