Search code examples
sqlexcelcombinationspermutation

Count occurrences of each combination


I have 5 conditions (A-E) and a bunch of patient ID's. My data set is 2 columns: PatientID, Condition.

There are duplicate PatientID's with every new condition:

PatientID Condition
456 C
456 E
279 D
123 A
123 C
123 D
187 D
296 E
296 C

I believe there are 31 different potential combinations (order doesn't matter) of those 5 conditions (ie A, AB, ABC, AC, ACDE, etc)

I want to count how many patients in each combination of conditions. So my results for the above would be - CE: 2 D: 2 ACD: 1

I'm more familiar with Excel but if this is better handled in SQL, I can do it there. I think I need to create a table of all the different combinations (any help on that would be appreciated too) and then do a count from there but I'm not sure if that's the best way.


Solution

  • SQL Server solution

    SELECT
        conditions,
        COUNT(*)
    FROM (
        SELECT
            patientid,
            STRING_AGG(condition, '') WITHIN GROUP (ORDER BY condition) conditions
        FROM tbl
        GROUP BY patientid
    ) c
    GROUP BY conditions 
    

    Output

    conditions patient_count
    ACD 1
    CE 2
    D 2

    db<>fiddle here