Search code examples
sqldbt

Is there a better way of loop through a column for specific values?


I am building a lookup table in bigquery dealing with U.S States. In one of the columns key I have a bunch a values such as: code, provider, referral, state , phone_call, video_call.

The values from the key correspond with the value_type field and the value field. I need to make the values from the key field into a column, and assign the value from the value field to it. On top of that, I only need to get the provider, referral, phone_call, and video_call. I am currently doing a case statement, but that feels a bit clunky.

The sample code looks something like this:

id|name      |code|state_id|key                                  |label                            |value_type|value     |
--+----------+----+--------+-------------------------------------+---------------------------------+----------+----------+
52|Porto Rico|PR  |      52|phone_call                           |Phone Call                       |boolean   |0         |
52|Porto Rico|PR  |      52|video_call                           |Video Call                       |boolean   |0         |
52|Porto Rico|PR  |      52|code                                 |Code                             |text      |PR        |
52|Porto Rico|PR  |      52|referral                             |Referral                         |select    |ASYNC     |
52|Porto Rico|PR  |      52|provider                             |Provider                         |boolean   |false     |
52|Porto Rico|PR  |      52|state                                |State                            |text      |Porto Rico|

My code currently looks like this:

SELECT 
    state_id
    , name      AS state
    , code
    , CASE
        WHEN key = "provider"
        THEN value 
    END         AS provider
    , CASE 
        WHEN key = "referral"
        THEN value 
    END         AS referral
    , CASE 
        WHEN key = "phone_call"
        THEN value 
    END         AS phone_call
    , CASE 
        WHEN key = "video_call"
        THEN value 
    END         AS video_call
FROM {{ ref('enrich_states')}}

Is there a better way to handle this in DBT rather than using case statements? I feel that it's a bit clunky but because I only need specific values from the key I can't think of another way to make this cleaner. Could I get some help on this, please?


Solution

  • I believe you have two options.

    1. Use Pivot functionality (available in BigQuery, but normally in other databases as well):
    -- FIRST PART
    {% set keys_query %}
    SELECT DISTINCT key FROM {{ ref('enrich_states') }}
    {% endset %}
    
    {% set results = run_query(keys_query) %}
    
    {% if execute %}
    {% set results_list = results.columns[0].values() %}
    {% else %}
    {% set results_list = [] %}
    {% endif %}
    
    -- SECOND PART
    SELECT * FROM {{ ref('enrich_states') }}
    PIVOT(MAX(value) FOR key IN (
      {% for key in results_list %}
        '{{ key }}'
        {% if not loop.last %},{% endif %}
      {% endfor %}
    ))
    
    • The first part contains run_query. That's for the case if you don't want to predefine a list of keys (see details here). If you know them upfront, just replace the first part with {% set results_list = ['phone_call', 'video_call','code'] %}
    • For run_query to work you need to add dbt_utilsto your packages.yml and run dbt deps
    packages:
      - package: dbt-labs/dbt_utils
        version: 1.1.1
    
    • The second part contains PIVOT operator invocation and for loop. See details on loops in dbt here
    1. Alternatively, you can simply add for loop to your initial code:
    {% set keys_query %}
    SELECT DISTINCT key FROM {{ ref('enrich_states') }}
    {% endset %}
    
    {% set results = run_query(keys_query) %}
    
    {% if execute %}
    {% set results_list = results.columns[0].values() %}
    {% else %}
    {% set results_list = [] %}
    {% endif %}
    
    SELECT 
      state_id
      , name AS state
      {% for key in results_list %}
        , CASE
            WHEN key = '{{ key }}'
            THEN value
          END AS {{ key }}
      {% endfor %}
    FROM {{ ref('enrich_states') }}
    

    I haven't tested it in the actual environment, so let me know if you face any bugs