SQL/Big Query: identify counts to meet two or more conditions

I am new to SQL and writing queries on Google Big Query. I have a question regards whether I can select counts of id to meet one or more conditions, two or more conditions, etc.

For a simple case, say I have a group of people with different disease conditions. If I want to select how many of them have one or more conditions I can go below (where diease_X is a flag and 0 means not having the disease for this person 1 represents have).

select count(distinct ids)
from disease_table
where disease_A=1 or disease_B=1 or disease_C=1

If I want to find the number of people have all diseases I can change or to and in my where conditions. But how do I count the number of people have 2 or more conditions? I think it is possible to solve it by listing all possibilities of disease, or using the total count to subtract number of people with only 1 or 0 disease.

However, I would like to know if there is a simple way for doing this since I have 2 follow up questions:

1. My data table has 20+ diseases to work on, so it will make a long list in the where clause if I need to list them all.
1. I am also curious to find out how to know the counts for three or more conditions, four or more conditions...

Any thought on this would be super helpful. I am currently writing a query in Google Big Query, so hopefully there is a magic clause I can use to make it happen.

Solution

UNPIVOT might help solve your problem:

Reshape your table using UNPIVOT first.

CREATE TEMP TABLE disease_table AS
SELECT 'a' AS ids, 1 AS disease_A, 1 AS disease_B, 1 AS disease_C
 UNION ALL
SELECT 'b' AS ids, 1 AS disease_A, 0 AS disease_B, 1 AS disease_C
 UNION ALL
SELECT 'c' AS ids, 1 AS disease_A, 1 AS disease_B, 0 AS disease_C
;

SELECT * 
  FROM disease_table UNPIVOT (has_disease FOR disease_name IN (disease_A, disease_B, disease_C))
;

Original table will turn into one like below.

Count the number of diseases of each people.

SELECT ids, SUM(has_disease) number_of_disease 
  FROM disease_table UNPIVOT (has_disease FOR disease_name IN (disease_A, disease_B, disease_C))
 GROUP BY ids 
;

output:

For finding out the people having over n diseases, HAVING clause would be helpful.

SELECT ids, SUM(has_disease) number_of_disease 
  FROM disease_table UNPIVOT (has_disease FOR disease_name IN (disease_A, disease_B, disease_C))
 GROUP BY ids 
HAVING number_of_disease > 2