I have a table and two of it's columns look like this:
Name cars_owned
Adam Audi
David BMW
Steve Ford
John Audi
Adam Audi
....This continues on for ~1,000 rows
I want an output like this:
cars_owned count
Audi 3
BMW 1
Ford 1
...
The issue is rows can be repeated, meaning 'Adam' and 'Audi' might appear together 100 times but for the name 'Adam' I want it to just count as one for the Audi counts regardless of however many times it appears in the table. Additionally 'Adam' might appear with 'Audi' 20 times but also appear with 'BMW' 5 times, in this case I want 'Adam' to only get into the Audi count as that appears more times than with BMW
I was able to get a count of the distinct names (247) using this code:
SELECT COUNT(DISTINCT Name)
FROM cars_table
Now I just need to break those 247 distinct names into the correct cars_owned categories to get the counts
You want the most common value for each name and then the distribution. Use two levels of aggregation:
select car, count(*) as num_names, sum(cnt) as num_cars
from (select name, car, count(*) as cnt,
row_number() over (partition by name order by count(*) desc) as seqnum
from cars c
group by name, car
) nc
where seqnum = 1
group by car;
Note: If there is a tie for the most common car for a name, this returns one arbitrary value.