Search code examples
sqlmysql

Get records that have total count of over X amount


I have a VIEW that looks like this:

| email  | list  | sent_last_30_days |
--------------------------------------
| email1 | list1 |       1           |
| email2 | list1 |       0           |
| email1 | list2 |       1           |
| email3 | list3 |       1           |
| email1 | list3 |       1           |
| email2 | list2 |       0           |
| email1 | list4 |       0           |
| email2 | list3 |       1           |
--------------------------------------

If you take note of email1, it is found on 4 lists but only active on 3 (as indicated by the 1 or 0 in the sent_last_30_days column).

email2 is found on 3 lists, but only active on 1.

I need to write a query that will return the records that are active on 3 or more lists.

Using the example above, email1 will be the only records that will return.

So the results of said query will look something like this:

| email  | list  | sent_last_30_days |
--------------------------------------
| email1 | list1 |       1           |
| email1 | list2 |       1           |
| email1 | list3 |       1           |
| email1 | list4 |       0           |
--------------------------------------

Because email1 has been active on at least 3 lists.

email2 and email3 did not meet this requirement because they weren't active on at least 3 lists.

MySQL version: 5.7.44


Solution

  • Well, let's find email which are active in at least 3 distinct lists (e.g. list1, list2, list3 for email1):

      select mv.email                         
        from MyView mv
       where mv.sent_last_30_days <> 0       -- only active
    group by mv.email                        
      having count(distinct mv.list) >= 3    -- having at least 3 distinct lists
    

    having this subquery we can add it into where of the main query:

      select *
        from MyView
       where email in (select mv.email
                         from MyView mv
                        where mv.sent_last_30_days <> 0
                     group by mv.email
                       having count(distinct mv.list) >= 3)
    

    Drop distinct if an email should appear in any 3 lists, doesn't necessary distinct (e.g. list1, list1, list2)