Using integer in data as condition on number of rows in table

This is a representation of my table T1:

acc_id|acc_holders|bal_1|bal_2
00001|002|50|100
00001|002|50|100
00002|001|100|100
00003|003|50|100
00003|003|50|100

acc_holders indicates the number of account holders. The bal_1 figure represents the account balance split evenly between the holders. Bal_2 represents the overall account balance.

The problem is that in some rows (e.g. for acc_id 00003), the acc_holders value does not match the number of times that account appears (00003 should appear 3 times).

I'd like to select these anomalies using something like:

SELECT acc_id from t1
WHERE acc_holders <> count(distinct acc_id)

This throws error "misuse of aggregate function count()"

Solution

You have the right idea:

select acc_id
from t1
group by acc_id, acc_holders
having count(*) <> acc_holders;

Or, you might want:

select acc_id
from t1
group by acc_id, acc_holders
having count(*) <> min(acc_holders) or
       min(acc_holders) <> max(acc_holders);

You have two potential problems. One is that acc_holders might not represent the number of account rows. The second is that acc_holders might vary among rows for a given account. This latter version gets both these situations.