This was really difficult to explain in the title of this, but here is a table that I have:
CATEGORY_ID COUNT GROUPING
1 130 H
2 54 B
3 128 C
4 70 D
5 31 E
6 25 F
7 64 A
8 59 F
9 66 B
10 62 E
11 129 C
12 52 G
13 27 A
14 102 A
15 101 C
I'm trying to write a query to get the TOP 5
CATEGORY_ID
's, first ordered by overall COUNT
, but then based on that group use the other CATEGORY_ID
's in that group regardless of their COUNT
. So, if I wanted to TOP 5
of the above based on this rule (which I probably explained somewhat poorly), my result would be:
CATEGORY_ID COUNT GROUPING
6 25 F <-- THE LOWEST COUNT OVERALL
8 59 F <-- THE NEXT LOWEST IN GROUP 'F'
13 27 A <-- THE NEXT LOWEST OVERALL
7 64 A <-- THE NEXT LOWEST IN GROUP 'A'
14 102 A <-- THE NEXT LOWEST IN GROUP 'A'
I've looked a lot here and elsewhere (tried things like RANK(), DENSE_RANK(), GROUPING SETS, and more - mostly as stabs in the dark) and have hit walls all around.
EDIT: One additional thing is that I need to break ties of COUNT
randomly. So, for example, if COUNT
is 0
for all rows, then the first group returned should be random. I have tried this by adding NEWID()
to the ORDER BY
in both answers below, but with no luck.
Thank you.
; with groups as (
select
grouping,
min(count) as group_min
from categories
group by grouping
)
select top 5 c.category_id, c.count, c.grouping
from categories c
join groups g on c.grouping = g.grouping
order by g.group_min, c.count
Edit:
To randomize in case of ties, you can add a random order to each group using row_number()
and newid()
:
; with groups as (
select
grouping,
row_number() over (order by newid()) as random,
min(count) as group_min
from categories
group by grouping
)
select top 5
c.category_id, c.count, c.grouping
from categories c
join groups g on c.grouping = g.grouping
order by g.group_min, g.random, c.count