Search code examples
sqlpostgresqljoingroup-bywindow-functions

Count by condition of multiple inner joins in PostgreSQL


I have the following tables:

User
UserPostView
Post

UserPostView is a join table containing additional information about whether the User upvoted, downvoted, or passed after viewing the post.

Post has a column postable_type indicating the type of post (TextPost, ImagePost, etc).

I would like to calculate the upvotes, downvotes, and passed counts for each user grouped by the postable_type.

My current query is very slow and I'm pretty sure it can easily be optimized.

SELECT
    U.id,
    count((UP.postable_type = 'text_post' AND UPV.passed = true) OR NULL) as text_posts_pass_count,
    count((UP.postable_type = 'text_post' AND UPV.upvote = true) OR NULL) as text_posts_upvote_count,
    count((UP.postable_type = 'text_post' AND UPV.downvote = true) OR NULL) as text_posts_downvote_count,
    count((UP.postable_type = 'image_post' AND UPV.passed = true) OR NULL) as image_posts_pass_count,
    count((UP.postable_type = 'image_post' AND UPV.upvote = true) OR NULL) as image_posts_upvote_count,
    count((UP.postable_type = 'image_post' AND UPV.downvote = true) OR NULL) as image_posts_downvote_count
FROM
    users U
    INNER JOIN(
        SELECT
            user_id,
            post_id,
            passed,
            upvoted,
            downvoted
        FROM 
            user_post_views
    ) UPV on U.id :: TEXT = UPV.user_id :: TEXT
    INNER JOIN(
        SELECT
            id,
            postable_type
        FROM
            posts
    ) UP on UPV.post_id :: TEXT = UP.id :: TEXT
GROUP BY
    U.id

Solution

  • Don't do type conversions for joins! I think you just need:

    SELECT UPV.user_id,
           COUNT(*) FILTER (WHERE p.postable_type = 'text_post' AND upv.passed) as text_posts_pass_count,
           COUNT(*) FILTER (WHERE p.postable_type = 'text_post' AND upv.upvote) as text_posts_upvote_count,
           COUNT(*) FILTER (WHERE p.postable_type = 'text_post' AND upv.downvote ) as text_posts_downvote_count,
           COUNT(*) FILTER (WHERE p.postable_type = 'image_post' AND upv.passed) as image_posts_pass_count,
           COUNT(*) FILTER (WHERE p.postable_type = 'image_post' AND upv.upvote) as image_posts_upvote_count,
           COUNT(*) FILTER (WHERE p.postable_type = 'image_post' AND upv.downvote) as image_posts_downvote_count
    FROM user_post_views upv JOIN
         posts p
         ON upv.post_id = p.id 
    GROUP BY upv.user_id;
    

    Changes:

    • Do not do type conversions for joins! That definitely impedes the optimizer.
    • The users table doesn't seem to be necessary.
    • Subqueries are not necessary.
    • FILTER is slightly faster than conditional aggregation. More importantly, the intention is clearer.