I tried to compose a query on data.stackexchange.com which would give me the answer to the question what the percentage of users who have written posts which were viewed over 1000 times and who possess badges of a certain class.
Problem: if I write AND ViewCount > 1000
I get the same result as WHERE ViewCount > 1000
I don't understand why it happens
The query with WHERE:
SELECT
TOP 1000
Class, (Count(Badges.UserId)* 100 / (SELECT COUNT(*) From Badges)) AS Percentage
FROM Badges
WHERE Badges.UserId IN (SELECT Posts.OwnerUserId FROM Posts
INNER JOIN
Badges ON Posts.OwnerUserId = Badges.UserId
WHERE ViewCount > 1000)
GROUP BY Class
ORDER BY Class
The query with AND:
SELECT
TOP 1000
Class, (Count(Badges.UserId)* 100 / (SELECT COUNT(*) From Badges)) AS Percentage
FROM Badges
WHERE Badges.UserId IN (SELECT Posts.OwnerUserId FROM Posts
INNER JOIN
Badges ON Posts.OwnerUserId = Badges.UserId
AND ViewCount > 1000)
GROUP BY Class
ORDER BY Class
Output (for some reason it is the same in both cases):
1 class - 2 percent
2 class - 15 percent
3 class - 45 percent
You are using the AND
to connect the two expressions in the ON
condition of the JOIN
. Focus on the inner query:
SELECT Posts.OwnerUserId FROM Posts
INNER JOIN Badges ON Posts.OwnerUserId = Badges.UserId
WHERE ViewCount > 1000
Versus:
SELECT Posts.OwnerUserId FROM Posts
INNER JOIN Badges
ON (
Posts.OwnerUserId = Badges.UserId
AND ViewCount > 1000
)
These two queries are logically equivalent.