Search code examples
sqlgoogle-bigquerycaseany

SQL: Creating a column depending on whether there is at least one value in another column that fulfils a certain condition or not


I want to know whether a marketing campaign is active or not, and the only way to know it is checking if at least one of the ads is still active. So I have this table:

+-------------+--------+--------------+
| campaign_id | ad_id  |   ad_status  |
+-------------+--------+--------------+
|  campaign1  |  ad1   |   INACTIVE   |
+-------------+--------+--------------+
|  campaign1  |  ad2   |   ACTIVE     |
+-------------+--------+--------------+
|  campaign2  |  ad3   |   INACTIVE   |
+-------------+--------+--------------+
|  campaign2  |  ad4   |   INACTIVE   |
+-------------+--------+--------------+

The desired outcome would be like this:

+-------------+--------+--------------+-----------------+
| campaign_id | ad_id  |   ad_status  | campaign_status |
+-------------+--------+--------------+-----------------+
|  campaign1  |  ad1   |   INACTIVE   |      ACTIVE     |
+-------------+--------+--------------+-----------------+
|  campaign1  |  ad2   |   ACTIVE     |      ACTIVE     |      
+-------------+--------+--------------+-----------------+
|  campaign2  |  ad3   |   INACTIVE   |     INACTIVE    |
+-------------+--------+--------------+-----------------+
|  campaign2  |  ad4   |   INACTIVE   |     INACTIVE    |
+-------------+--------+--------------+-----------------+

I've been trying combining CASE WHEN and WHERE ANY inside another SELECT clause but I don't seem to get how to get it done, also I don't want to have it GROUP BY, just want that column with the repeated values.

Also I'm using BigQuery in case there is a specific BigQuery way to do this.


Solution

  • Try this by using UNNEST():

    with sample_data as (
      select 'campaign1' as campaign_id, 'ad1' as ad_id, 'INACTIVE' as ad_status,
      union all select 'campaign1' as campaign_id, 'ad2' as ad_id, 'ACTIVE' as ad_status,
      union all select 'campaign2' as campaign_id, 'ad3' as ad_id, 'INACTIVE' as ad_status,
      union all select 'campaign2' as campaign_id, 'ad4' as ad_id, 'INACTIVE' as ad_status,
    )
    
      SELECT *, if ('ACTIVE' IN UNNEST(array_agg(ad_status) over (partition by campaign_id)), 'ACTIVE', 'INACTIVE' ) AS contains_value
    
    from sample_data
    

    Here is the ouptut: enter image description here