Search code examples
sqlsql-serverperformancet-sqlquery-optimization

Counting DISTINCT over multiple columns


Is there a better way of doing a query like this:

SELECT COUNT(*) 
FROM (SELECT DISTINCT DocumentId, DocumentSessionId
      FROM DocumentOutputItems) AS internalQuery

I need to count the number of distinct items from this table but the distinct is over two columns.

My query works fine but I was wondering if I can get the final result using just one query (without using a sub-query)


Solution

  • If you are trying to improve performance, you could try creating a persisted computed column on either a hash or concatenated value of the two columns.

    Once it is persisted, provided the column is deterministic and you are using "sane" database settings, it can be indexed and / or statistics can be created on it.

    I believe a distinct count of the computed column would be equivalent to your query.