Search code examples
postgresql

How can I filter for ids with more than one corresponding unique value in a certain column?


I have the following postgres SQL table with three columns;

    id      linktype      to_id    
  322754       4            1
  322754       4            1
  322754       4            2
  322798       4            2
  322798       4            2
  322797       4            3
  322791       4            6
  322790       4            3

What I am trying to do is filter for to_id values where there are more than one corresponding id value that is different.

So from the above I would simply want to_id value 3 returned, since it is the only to_id to have more than one DISTINCT id corresponding to it.


Solution

  • Use a having clause
    demo at db<>fiddle

    select to_id
    from test
    group by to_id
    having count(distinct id)>1;
    
    to_id
    2
    3

    To make that faster, use exists()

    select distinct to_id
    from test a
    where exists(
      select from test b
      where a.id<>b.id
        and a.to_id=b.to_id);
    

    Or a self-join:

    select distinct a.to_id
    from test a
    join test b 
      on a.id<>b.id
      and a.to_id=b.to_id;
    

    There are other methods using except/intersect, an in/not in, but exists() is usually quicker.

    Don't forget about the index:

    create index on test (to_id,id);
    

    It speeds things up for all 3 methods above, significantly. Also note that the optimal method and index depends on the characteristics of your data set: on 100k records where only 807 to_id's match your criteria, the last two examples win: demo1. If out of that 100k records I make 28k match that, the first example becomes faster: demo2.

    Make sure to run your own benchmarks and adjust your approach.