SQL Server 2017.
I have 2 tables, with hacker name and id, and another coding challenges submitted by each (below). I need to output id, name and count of challenges, filtering out those hackers who submitted the same number of challenges, except when this number is a maximum.
Here is sample data and final output i need
Hackers:
hacker_id name
1 john
2 tom
3 anna
4 mary
5 steve
Challenges:
challenge_id hacker_id
1 1
2 1
3 1
4 2
5 2
6 2
7 2
8 3
9 3
10 3
11 4
12 4
13 4
14 4
15 5
16 5
this is the count of challenges per person (from here, we see that max number is 4 per person) :
hacker_id name count of challenges
1 john 3
2 tom 4
3 anna 3
4 mary 4
5 steve 2
The final output would be as follows:
hacker_id name count of challenges
2 tom 4
4 mary 4
5 steve 2
i.e. tom and mary both submitted 4 challenges. They are included because although the number 4 repeats, it's a maximum John and Anna submitted 3 each. They are excluded because 3 is not a max per person. Steve submitted 2 and this number is unique, so he's included, too.
Here is my code:
SELECT h.hacker_id,
h.name,
COUNT(c.challenge_id) AS ChalCountPerHead
FROM hackers h
JOIN challenges c ON h.hacker_id = c.hacker_id
LEFT JOIN (
SELECT d.FreqHacker, COUNT(d.FreqHacker) as FreqOfFreq FROM
(SELECT hacker_id, COUNT(challenge_id) AS FreqHacker
FROM Challenges GROUP BY hacker_id) d
GROUP BY d.FreqHacker
) dd
ON FreqHacker = COUNT(c.challenge_id)
GROUP BY h.hacker_id, h.name
HAVING
COUNT(c.challenge_id) = (SELECT MAX(d.FreqHacker) from d)
OR
dd.FreqOfFreq = 1
It doesn't work, stating an error message on this line
ON FreqHacker = COUNT(c.challenge_id)
An aggregate cannot appear in an ON clause unless it is in a sub query contained in a HAVING clause or select list, and the column being aggregated is an outer reference.
Here is one way to do it.
Having sample data in the question make it way easier to verify the solution. Please include it next time.
CTE
is a simple aggregation to get the number of challenges submitted by each hacker.
In CTE2
MAX
gives the global maximum frequency. HackerCountOfSameFreq
is the count of hackers that have the same frequency.
Final WHERE
removes groups of hackers that consist of more than 1 hacker, but leaves the group that has the maximum frequency.
Sample data
DECLARE @Hackers TABLE (hacker_id int, name varchar(50));
INSERT INTO @Hackers VALUES
(1, 'john'),
(2, 'tom'),
(3, 'anna'),
(4, 'mary'),
(5, 'steve');
DECLARE @Challenges TABLE (challenge_id int, hacker_id int);
INSERT INTO @Challenges VALUES
(1 , 1),
(2 , 1),
(3 , 1),
(4 , 2),
(5 , 2),
(6 , 2),
(7 , 2),
(8 , 3),
(9 , 3),
(10, 3),
(11, 4),
(12, 4),
(13, 4),
(14, 4),
(15, 5),
(16, 5);
Query
WITH
CTE
AS
(
SELECT hacker_id, COUNT(*) AS FreqHacker
FROM @Challenges
GROUP BY hacker_id
)
,CTE2
AS
(
SELECT
hacker_id
,FreqHacker
,COUNT(*) OVER (PARTITION BY FreqHacker) AS HackerCountOfSameFreq
,MAX(FreqHacker) OVER () AS GlobalMaxFreq
FROM CTE
)
SELECT
CTE2.hacker_id
,CTE2.FreqHacker
,H.Name
FROM
CTE2
INNER JOIN @Hackers AS H ON H.hacker_id = CTE2.hacker_id
WHERE
HackerCountOfSameFreq = 1
OR FreqHacker = GlobalMaxFreq
ORDER BY
CTE2.hacker_id
;
Result
+-----------+------------+-------+
| hacker_id | FreqHacker | Name |
+-----------+------------+-------+
| 2 | 4 | tom |
+-----------+------------+-------+
| 4 | 4 | mary |
+-----------+------------+-------+
| 5 | 2 | steve |
+-----------+------------+-------+
Your query also produces correct result (at least with this sample data), once the syntax is fixed.
I've split it into CTEs, leaving most of your logic as is:
WITH
d
AS
(
SELECT hacker_id, COUNT(challenge_id) AS FreqHacker
FROM @Challenges
GROUP BY hacker_id
)
,dd
AS
(
SELECT d.FreqHacker, COUNT(d.FreqHacker) as FreqOfFreq
FROM d
GROUP BY d.FreqHacker
)
,d3
AS
(
SELECT
h.hacker_id,
h.name,
COUNT(c.challenge_id) AS ChalCountPerHead
FROM
@hackers h
JOIN @challenges c ON h.hacker_id = c.hacker_id
GROUP BY h.hacker_id, h.name
)
,d4
AS
(
SELECT *
FROM
d3
LEFT JOIN dd ON dd.FreqHacker = ChalCountPerHead
)
SELECT *
FROM d4
WHERE
ChalCountPerHead = (SELECT MAX(d.FreqHacker) from d)
OR FreqOfFreq = 1
ORDER BY hacker_id
;