Search code examples
sqlpostgresqljoinpostgresql-9.6postgresql-performance

Best way to get distinct count from a query joining two tables (multiple join possibilities)


I have 2 tables, table Actions & table Users. Actions -> Users is many-one association.

Table Actions (has thousands of rows)

  • id
  • uuid
  • name
  • type
  • created_by
  • org_id

Table Users (has a max of hundred rows)

  • id
  • username
  • org_id
  • org_name

I am trying to get the best join query to obtain a count with a WHERE clause. I need the count of distinct created_bys from table Actions with an org_name in Table Users that contains 'myorg'. Also, (Actions.created_by = Users.username)

I currently have the below queries (producing expected results) and wonder which is better and if it can be optimized further?

Query 1:

select count(distinct Actions.created_by)
from Actions join
     Users
     on Users.org_id = Actions.org_id 
where Users.org_name like '%myorg%';

Query 2:

select count(distinct Users.username)
from Users join
     Actions
     on Actions.created_by = Users.username 
where Users.org_name like '%myorg%';

Solution

  • The fastest approach is to modify the second query:

    select count(*)
    from Users u
    where exists (select 1
                  from Actions a
                  where a.created_by = u.username 
                 )
    and u.org_name like '%myorg%';
    

    Then the best index is on actions(created_by).