Search code examples
postgresqlcountsqlalchemydistinctaggregate-functions

SqlAlchemy: count of distinct over multiple columns


I can't do:

>>> session.query(
        func.count(distinct(Hit.ip_address, Hit.user_agent)).first()
TypeError: distinct() takes exactly 1 argument (2 given)

I can do:

session.query(
        func.count(distinct(func.concat(Hit.ip_address, Hit.user_agent))).first()

Which is fine (count of unique users in a 'pageload' db table).

This isn't correct in the general case, e.g. will give a count of 1 instead of 2 for the following table:

 col_a | col_b
----------------
  xx   |  yy
  xxy  |  y

Is there any way to generate the following SQL (which is valid in postgresql at least)?

SELECT count(distinct (col_a, col_b)) FROM my_table;

Solution

  • The exact query can be produced using the tuple_() construct:

    session.query(
        func.count(distinct(tuple_(Hit.ip_address, Hit.user_agent)))).scalar()