Search code examples
sql-serverdataexplorer

How to get user counts for both posts and comments, by certain tags?


With this dataexplorer query:

select b.DisplayName as owner, a.N from
(select OwnerUserId, count(*) as N from Posts where (Tags like '%perl6%' or Tags like '%rakudo%' or (Tags like '%parrot%' and Tags like '%perl%')) group by OwnerUserId) as a,
(select Id, DisplayName from Users) as b
where a.OwneruserId = b.Id order by N desc;

I can list all users that have posted some question with a certain tag.
However, I'd like to also list the users that have answered or commented in posts with that tag.

I guess it involves using the Comment table, however it's not clear to me how to merge the UserIds from both tables.


Solution

  • Refer to the SEDE schema:

    1. Get the questions by tags.
    2. Better to use tests against the Tags table than using LIKE operations on the tags column. The former can be 20 times faster.
    3. Use the question list, from step 1 to get answers.
    4. Use both the question list and the answer list to get comments.
    5. Note that SEDE provides wonderful magic columns like [User Link].

    Putting it all together, here's one way:

    WITH questsByTags AS (
        SELECT DISTINCT
                    q.Id
                    , q.OwnerUserId
        FROM        Posts q
        INNER JOIN  PostTags pt     ON q.Id = pt.PostId
        INNER JOIN  Tags t          ON t.Id = pt.TagId
        WHERE       q.PostTypeId    = 1  -- questions
        AND (
                t.TagName   = 'perl6'
            OR  t.TagName   = 'rakudo'
            OR (
                t.TagName   = 'parrot'
                AND EXISTS (
                    SELECT      * FROM PostTags pt2
                    INNER JOIN  Tags t2      ON  t2.Id = pt2.TagId
                    WHERE q.Id = pt2.PostId  AND t2.TagName = 'perl'
        )   )   )
    ),
    answersByTags AS (
        SELECT      a.Id
                    , a.OwnerUserId
        FROM        Posts a
        INNER JOIN  questsByTags qbt  ON qbt.Id = a.ParentId
    ),
    commntsByTags AS (
        SELECT      c.Id
                    , c.UserId  AS [OwnerUserId]
        FROM        Comments c
        INNER JOIN (
            SELECT              Id FROM questsByTags
            UNION ALL SELECT    Id FROM answersByTags
        ) AS allPosts
        ON allPosts.Id = c.PostId
    ),
    allUsers AS (
        SELECT          OwnerUserId FROM questsByTags
        UNION SELECT    OwnerUserId FROM answersByTags
        UNION SELECT    OwnerUserId FROM commntsByTags
    )
    SELECT      au.OwnerUserId      AS [User Link]
                , (SELECT Count (qbt.Id) FROM questsByTags  qbt WHERE qbt.OwnerUserId = au.OwnerUserId)  AS [Num Qsts]
                , (SELECT Count (abt.Id) FROM answersByTags abt WHERE abt.OwnerUserId = au.OwnerUserId)  AS [Num Ans]
                , (SELECT Count (cbt.Id) FROM commntsByTags cbt WHERE cbt.OwnerUserId = au.OwnerUserId)  AS [Num Cmmnts]
    FROM        allUsers au
    WHERE       au.OwnerUserId IS NOT NULL
    ORDER BY    [Num Qsts] DESC, [Num Ans] DESC, [Num Cmmnts] DESC
    

    You can see it live, in action at this SEDE link.