I want to find out which users have the most reputation with the fewest posts (fewer than 10). Why can't I put a WHERE
clause before this join?
SELECT TOP 100 Users.Id, Users.DisplayName AS [Username], Users.Reputation, COUNT(Posts.Id) AS [Post Count] FROM Users
-- WHERE COUNT(Posts.Id) < 10
JOIN Posts ON Posts.OwnerUserId = Users.Id
GROUP BY Users.Id, Users.DisplayName, Users.Reputation
ORDER BY Users.Reputation DESC;
The original user post count example query is at data.stackexchange.com/stackoverflow/query/503051
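That is what the HAVING clause (MS reference) is for. WHERE filters individual rows before they are grouped, so an aggregate such as COUNT() cannot be used there; HAVING filters the groups after GROUP BY has been applied.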
You would use:
SELECT TOP 100 Users.Id, Users.DisplayName AS [Username], Users.Reputation, COUNT(Posts.Id) AS [Post Count] FROM Users
JOIN Posts ON Posts.OwnerUserId = Users.Id
GROUP BY Users.Id, Users.DisplayName, Users.Reputation
HAVING COUNT(Posts.Id) < 10
ORDER BY Users.Reputation DESC;
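WHERE and HAVING can also be combined in one query. The sketch below is only an illustration of where each clause goes: the reputation threshold of 1000 is an arbitrary assumption and not part of the original question.

```sql
SELECT TOP 100 Users.Id, Users.DisplayName AS [Username], Users.Reputation, COUNT(Posts.Id) AS [Post Count]
FROM Users
JOIN Posts ON Posts.OwnerUserId = Users.Id
WHERE Users.Reputation >= 1000   -- row filter: applied before grouping (threshold is hypothetical)
GROUP BY Users.Id, Users.DisplayName, Users.Reputation
HAVING COUNT(Posts.Id) < 10      -- group filter: applied after grouping, so aggregates are allowed
ORDER BY Users.Reputation DESC;
```

Note that WHERE comes after all the JOINs of the FROM clause, not between FROM and JOIN, which is the other reason the commented-out line in the question cannot sit where it does.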
Here is the same idea, leveraging a few SEDE features (query parameters with default values, and a user column rendered as a link):
-- maxRows: How many rows to return:
-- maxPosts: Maximum number of posts a user can have:
SELECT TOP ##maxRows:INT?100##
'site://u/' + CAST(u.Id AS NVARCHAR) + '|' + u.DisplayName AS [User]
, u.Reputation
, COUNT (p.Id) AS [Post Count]
FROM Users u
LEFT JOIN Posts p ON (p.OwnerUserId = u.Id AND p.PostTypeId IN (1, 2) ) -- Q & A only
GROUP BY u.Id
, u.DisplayName
, u.Reputation
HAVING COUNT (p.Id) <= ##maxPosts:INT?10##
ORDER BY u.Reputation DESC
, [Post Count]
, u.DisplayName
You can see it live in SEDE.
I particularly like the users with higher rep that have no posts.
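Those zero-post users appear only because the query uses LEFT JOIN together with COUNT (p.Id) rather than COUNT (*). A minimal sketch of the difference, using made-up rows in place of the real Users and Posts tables (the names alice and bob and all Ids are hypothetical):

```sql
-- Hypothetical sample data, not real SEDE rows.
WITH u (Id, DisplayName) AS (
    SELECT * FROM (VALUES (1, 'alice'), (2, 'bob')) AS v (Id, DisplayName)
),
p (Id, OwnerUserId) AS (
    SELECT * FROM (VALUES (10, 1), (11, 1)) AS v (Id, OwnerUserId)
)
SELECT u.DisplayName
     , COUNT (p.Id) AS [Post Count]   -- counts only rows where p.Id is not NULL
     , COUNT (*)    AS [Row Count]    -- counts every joined row, including the NULL-padded one
FROM u
LEFT JOIN p ON p.OwnerUserId = u.Id
GROUP BY u.DisplayName;
```

Here alice gets 2 for both columns, while bob, who has no posts, gets a Post Count of 0 but a Row Count of 1. With an inner JOIN, as in the first query, bob would not be returned at all.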