Search code examples
sqloracle-databaseaggregate-functions

How to do count(distinct) for multiple columns


This does not work:

select count(distinct colA, colB) from mytable

I know I can simply solve this by making a double select.

select count(*) from (
    select distinct colA, colB from mytable
)

Is there anyway I can do this without having to do the sub-select?


Solution

  • [TL;DR] Just use a sub-query.


    If you are trying to use concatenation then you need to ensure that you delimit the terms with a string that is never going to appear in the values otherwise you will find non-distinct terms grouped together.

    For example: if you have a two numeric column then using COUNT(DISTINCT col1 || col2) will group together 1||23 and 12||3 and count them as one group.

    You could use COUNT(DISTINCT col1 || '-' || col2) but if the columns are string values and you have 'ab-'||'-'||'c' and 'ab'||'-'||'-c' then, once again, they would be identical once concatenated.

    The simplest method is to use a sub-query.

    If you can't do that then you can combine columns via string-concatenation but you need to analyse the contents of the column and pick a delimiter that does not appear in your strings otherwise your results might be erroneous. Even better is to ensure that the delimiter character will never be in the sub-string with check constraints.

    ALTER TABLE mytable ADD CONSTRAINT mytable__col1__chk CHECK (col1 NOT LIKE '%¬%');
    ALTER TABLE mytable ADD CONSTRAINT mytable__col2__chk CHECK (col2 NOT LIKE '%¬%');
    

    Then:

    SELECT COUNT(DISTINCT col1 || '¬' || col2)
    FROM   mytable;