Search code examples
looker-studio

How can I re-categorize any counts < n in Google Data Studio?


I'm creating a dashboard that shows some demographic information for a population. However, in cases where less than 25 individuals show a certain demographic trait, I need to be able to re-categorize that trait as "REDACTED" to protect privacy.

A pseudo-code version might be:

CASE
  WHEN COUNT(gender) < 25 THEN "REDACTED"
  ELSE gender
END

Unfortunately, this mixes aggregation and non-aggregation and isn't workable in a calculated field.

I'm hoping for some suggestions on how to solve preferably without aggregating first in the data source --I'd like to be for my data and blends to exist at the individual grain with aggregation happening within Google Data Studio.

Publicly-Editable Example Google Data Studio report

Data source (INPUT) is the source, also available here.

Expected output:

gender count
female 222
male 223
REDACTED 55

Solution

  • 0) Summary

    The below looks at:

    1. Issue: "calculated fields can't mix metrics (aggregated values) and dimensions (non-aggregated values)"
    2. Suggestion: Reaggregating data using a blend

    1) Issue

    A system error is displayed:

    Sorry, calculated fields can't mix metrics (aggregated values) and dimensions (non-aggregated values). Please check the aggregation types of the fields used in this formula. Learn more.

    when using the calculated field in the question:

    CASE
      WHEN COUNT(gender) < 25 THEN "REDACTED"
      ELSE gender
    END
    

    This ("aggregated values") refers to the 2nd line in the CASE, specifically the aggregated field:

    COUNT(gender)

    2) Suggestion

    One way is to use reaggregation with a blend, by performing a self blend (where the data source of both tables are the same) and creating a dedicated COUNT(gender) field (titled gender_count below):

    2.1) Blend Fields

    Table 1 Table 2
    Dimension #1: gender Dimension #1: gender
    Dimension #2: person_id
    Dimension #3: dept
    Metric: Record Count Metric: gender
    Aggregation: COUNT
    Rename: gender_count
    table_1 table_2

    2.2) Join Configuration

    Join Description Table 1 🔗 Table 2
    Join Operator Left Outer
    Join Condition gender 🔗 gender
    Image join

    In summary, the result of the blend will look like:

    blend_summary

    2.3) Pie Chart

    • Dimension:
      CASE
        WHEN gender_count < 25 THEN "REDACTED"
        ELSE gender
      END
      
    • Metric: Record Count

    Publicly editable Google Data Studio report (embedded Google Sheets data source) and a GIF to elaborate:

    gif