Tags: sql, sqlite, count, distinct

SQL get counts of unique values


Using sqlite3 (version 3.27.2), how can I get the different "count" columns for this example in-memory database?

Example Database

CREATE TABLE events (
    id1, 
    id2, 
    id3, 
    PRIMARY KEY (id1, id2)
);

INSERT INTO events (id1, id2, id3)
VALUES 
   (1,1,99),
   (1,2,99),
   (1,3,52),
   (2,1,6),
   (2,2,7),
   (2,3,8)
;

.mode columns
.header on
SELECT * FROM events;


Desired Printed Output

(Image not preserved: a table with the columns id1, total_count, unique_count, and repeated_count.)

Partial Success: the following works for the first two new columns.

SELECT id1, count(id3) AS total_count, count(DISTINCT id3) AS unique_count
FROM events
GROUP BY id1;

What is the best way to get the last column? The following query fails with the error "no such column: total_count":

SELECT id1, count(id3) AS total_count, count(DISTINCT id3) AS unique_count, (total_count - unique_count) AS repeated_count
FROM events
GROUP BY id1;
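
The error arises because SQLite does not resolve a result-column alias inside another expression of the same SELECT list. One simple workaround is to repeat the aggregate expressions instead of referring to their aliases. The following is a minimal sketch of that idea, assuming Python's built-in sqlite3 module as the driver (the Python wrapper is an assumption for illustration; the SQL itself can be pasted into the sqlite3 shell):

```python
import sqlite3

# In-memory database with the example table from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id1, id2, id3, PRIMARY KEY (id1, id2));
INSERT INTO events (id1, id2, id3) VALUES
    (1,1,99), (1,2,99), (1,3,52), (2,1,6), (2,2,7), (2,3,8);
""")

# Repeat the aggregates rather than referencing the aliases
# total_count and unique_count in the same SELECT list.
rows = conn.execute("""
    SELECT id1,
           count(id3)                       AS total_count,
           count(DISTINCT id3)              AS unique_count,
           count(id3) - count(DISTINCT id3) AS repeated_count
    FROM events
    GROUP BY id1
    ORDER BY id1
""").fetchall()

for row in rows:
    print(row)
```

For the sample data this prints (1, 3, 2, 1) and (2, 3, 3, 0).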

Solution

  • If you group by id1, id3 like this:

    SELECT id1, id3, COUNT(*) AS counter
    FROM events
    GROUP BY id1, id3;
    

    you get the number of rows for each combination of id1, id3:

    id1  id3  counter
    ---  ---  -------
    1    52   1
    1    99   2
    2    6    1
    2    7    1
    2    8    1

    Now, all you have to do is:

    • sum the column counter for each id1 to get the column total_count
    • count the number of rows for each id1 to get the column unique_count
    • count the number of rows for each id1 where the column counter is > 1 to get the column repeated_id3 (the number of id3 values that occur more than once)

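    You can do this with SUM() and COUNT() window functions, which SQLite supports since version 3.25.0 (so 3.27.2 qualifies). They are evaluated after GROUP BY, over one row per (id1, id3) group, and DISTINCT then collapses the result to one row per id1: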

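    SELECT DISTINCT id1,
           -- rows per id1: sum of the per-(id1, id3) group sizes
           SUM(COUNT(*)) OVER (PARTITION BY id1) AS total_count,
           -- number of (id1, id3) groups per id1, i.e. distinct id3 values
           COUNT(*) OVER (PARTITION BY id1) AS unique_count,
           -- number of id3 values that occur more than once per id1
           SUM(COUNT(*) > 1) OVER (PARTITION BY id1) AS repeated_id3
    FROM events
    GROUP BY id1, id3;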
    

    Results:

    id1  total_count  unique_count  repeated_id3
    ---  -----------  ------------  ------------
    1    3            2             1
    2    3            3             0
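
The three steps listed in the solution can also be expressed without window functions, by aggregating a grouped subquery a second time. This is only a sketch of that alternative, not the answer's own method; it assumes Python's built-in sqlite3 module as the driver, and it should also run on SQLite versions older than 3.25.0:

```python
import sqlite3

# In-memory database with the example table from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id1, id2, id3, PRIMARY KEY (id1, id2));
INSERT INTO events (id1, id2, id3) VALUES
    (1,1,99), (1,2,99), (1,3,52), (2,1,6), (2,2,7), (2,3,8);
""")

# Inner query: one row per (id1, id3) with its row count.
# Outer query: aggregate those rows again per id1.
rows = conn.execute("""
    SELECT id1,
           SUM(counter)     AS total_count,   -- step 1: sum of the group sizes
           COUNT(*)         AS unique_count,  -- step 2: number of groups
           SUM(counter > 1) AS repeated_id3   -- step 3: groups with counter > 1
    FROM (
        SELECT id1, id3, COUNT(*) AS counter
        FROM events
        GROUP BY id1, id3
    )
    GROUP BY id1
    ORDER BY id1
""").fetchall()

for row in rows:
    print(row)
```

For the sample data this prints (1, 3, 2, 1) and (2, 3, 3, 0), matching the results above. Note that repeated_id3 counts the id3 values that occur more than once, whereas total_count - unique_count counts the surplus rows; the two agree here but differ as soon as some id3 value occurs three or more times for the same id1.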