Search code examples
sqlapache-sparkcountapache-spark-sqldistinct

best way to get count and distinct count of rows in single query


What is the best way to get count of rows and distinct rows in a single query?

To get distinct count we can use subquery like this:

select count(*) from
(
   select distinct * from table
)

I have 15+ columns and have many duplicates rows as well and I want to calculate count of rows as well as distinct count of rows in one query.

More if I use this

select  count(*) as Rowcount , count(distinct *) as DistinctCount from table

This will not give accurate results as count(distinct *) doesn't work.


Solution

  • Why don't you just put the subquery inside another query?

    select count(*),
           (select count(*) from (select distinct * from table))
    from table;