Search code examples
sqlitefilterappearance

Filter for appearance of 2 values that must at least exist 1 times


Title may be bad, couldn't think of a better one.

My comment data, each comment is assigned to an account by usernameChannelId:

usernameChannelId | hasTopic | sentiment_sum | commentId
a                 | 1        | 4             | xyxe24
a                 | 0        | 2             | h5hssd
a                 | 1        | 3             | k785hg
a                 | 0        | 2             | j7kgbf
b                 | 1        | -2            | 76hjf2
c                 | 0        | -1            | 3gqash
c                 | 1        | 2             | ptkfja
c                 | 0        | -2            | gbe5gs
c                 | 1        | 1             | hghggd

My code:

SELECT u.usernameChannelId, avg(sentiment_sum) sentiment_sum, u.hasTopic
FROM total_comments u
WHERE u.hasTopic is True
GROUP BY u.usernameChannelId
HAVING count(u.usernameChannelId) > 0

UNION

SELECT u.usernameChannelId, avg(sentiment_sum) sentiment_sum, u.hasTopic
FROM total_comments u
WHERE u.hasTopic is False
GROUP BY u.usernameChannelId

I want to get all usernameChannelIds that have at least 1 comment with hasTopic == 0 and 1 comment with hasTopic == 1 (to compare both groups statistically and remove user that only commented in topic or offtopic videos).

How can I filter like that?


Solution

  • Here's a little trick that may help. First, you need to get familiar with the CASE expression., here's an excerpt from the doc.

    The CASE expression

    A CASE expression serves a role similar to IF-THEN-ELSE in other programming languages.

    The optional expression that occurs in between the CASE keyword and the first WHEN keyword is called the "base" expression. There are two basic forms of the CASE expression: those with a base expression and those without.

    An expression like CASE when hasTopic is False then 1 else 0 END will evaluate to 1 if hasTopic is 0. An expression for hasTopic is True would be similar.

    Now, those CASEs can be summed, which will tell you if user has any rows with hasTopic True and hasTopic False.

    Something like this in the having clause might do the trick (one for each value of course)

    HAVING SUM(CASE when hasTopic is False then 1 else 0 END) > 0

    (it would be necessary to remove the WHERE clause, and the UNION query would be unnecessary).