Search code examples
sqlgoogle-bigquerycaselooker-studio

How to use a CASE expression to match multiples rows in Google Data Studio (SQL/Big Query data set)


I have a BigQuery table that contains a column which has a list of 'categories' (delimited by " / "):

  {
    "ID": "452",
    "Location": "France",
    "Calories": "400",
    "Categories": "/ brown / nutty / salty /"
  },
  {
    "ID": "288",
    "Location": "UK",
    "Calories": "800",
    "Categories": "/ brown / roasted / nutty /"
  },

I connect to my table using the following SQL query within Google Data Studio:

#standardSQL
SELECT
ID,
Location,
Calories,
Categories
FROM table_name;

I am trying to group rows in the table by 'Categories'.

I should then be able to filter the rows by a category and the metrics can be averaged. In the example above, if I filtered by 'Nutty', the average 'Calories' would be '600'.

The CASE expression I have tried only allows me to match the last 'Category' within the Category column, rather than matching to all rows that include the string:

CASE
WHEN REGEXP_MATCH(Categories, '.*nutty.*') THEN 'Nutty'
WHEN REGEXP_MATCH(Categories, '.*brown.*') THEN 'Brown'
WHEN REGEXP_MATCH(Categories, '.*salty.*') THEN 'Salty'
END

I believe this is an aggregation issue, but is there any way to allow each row to be matched within a 'Category' dimension multiple times in this way?

Any help much appreciated, thank you.


Solution

  • Using CASE in your use-case is not practical - instead you should use REGEXP_EXTRACT_ALL() or SPLIT() along with UNNEST() as below example shows (BigQuery Standard SQL)

     #standardSQL
    WITH `project.dataset.your_table` AS (
      SELECT "452" ID,"France" Location,400 Calories,"/ brown / nutty / salty /" Categories
        UNION ALL
      SELECT "288","UK",800,"/ brown / roasted / nutty /"    
    )
    SELECT ID, Location, Calories, Category
    FROM `project.dataset.your_table`, 
    UNNEST(REGEXP_EXTRACT_ALL(Categories, r' (\w+) ')) Category  
    

    this will give you below result

    Row ID  Location    Calories    Category     
    1   452 France      400         brown    
    2   452 France      400         nutty    
    3   452 France      400         salty    
    4   288 UK          800         brown    
    5   288 UK          800         roasted  
    6   288 UK          800         nutty    
    

    Now, you can GROUP BY whatever logic you want and all Categories will be accounted appropriatelly