Search code examples
postgresqlsphinx

how to select only unique values from related tables


I have three tables:

 id | name
----+------
  1 | a
  2 | b
  3 | c
  4 | d

branch_1
 id | parent_id | b1_name
----+-----------+---------
  1 |         1 | aaa
  2 |         1 | bbb
  3 |         1 | ccc
  4 |         2 | ddd
  5 |         2 | eee
  6 |         2 | fff
  7 |         2 | ggg
  8 |         2 | hhh

branch_2
 id | b2_name | parent_id
----+---------+-----------
  1 | b2_aa   |         1
  2 | b2_ba   |         2
  3 | b2_bb   |         2
  4 | b2_bc   |         2

I want to combine these tables with only unique values. I tried this code:

select p.id, b1.b1_name, b2.b2_name
from parent p
left outer join lateral (select distinct b1.b1_name from branch_1 b1 where b1.parent_id=p.id order by b1.b1_name) as b1 on true
left outer join lateral (select distinct b2.b2_name from branch_2 b2 where b2.parent_id=p.id order by b2.b2_name) as b2 on true

but in result has duplicate values:

 id | b1_name | b2_name
----+---------+---------
  1 | aaa     | b2_aa
  1 | bbb     | b2_aa
  1 | ccc     | b2_aa
  2 | ddd     | b2_ba
  2 | ddd     | b2_bb
  2 | ddd     | b2_bc
  2 | eee     | b2_ba
  2 | eee     | b2_bb
  2 | eee     | b2_bc
  2 | fff     | b2_ba
  2 | fff     | b2_bb
  2 | fff     | b2_bc
  2 | ggg     | b2_ba
  2 | ggg     | b2_bb
  2 | ggg     | b2_bc
  2 | hhh     | b2_ba
  2 | hhh     | b2_bb
  2 | hhh     | b2_bc

I want to get:

 id | b1_name | b2_name
----+---------+---------
  1 | aaa     | b2_aa
  1 | bbb     | ---
  1 | ccc     | ---
  2 | ddd     | b2_ba
  2 | eee     | b2_bb
  2 | fff     | b2_bc
  2 | ggg     | ---
  2 | hhh     | ---

maybe it seems strange, but I need only unique values without duplicates. So how I can only unique values?

P.S. I need such a table for facet search on sphinx, so that the count of products are considered correctly, with such a request:

select * from facetIndex facet b1_name, b2_name

Solution

  • You can use a window function to do the outer join only on the first rows:

    select t.id, t.b1_name, b2.b2_name
    from (
      select p.id, b1.b1_name, row_number() over (partition by p.id order by b1_name) as rn
      from parent p
        join branch_1 b1 on b1.parent_id = p.id
    ) t
      left join (
        select parent_id, b2_name, 
               row_number() over (partition by parent_id) as rn
        from branch_2 
      ) b2 on b2.parent_id = t.id and b2.rn = t.rn
    order by t.id;
    

    Online example: https://rextester.com/GNYF31027