I am building a lookup table in bigquery
dealing with U.S States. In one of the columns key
I have a bunch a values such as: code
, provider
, referral
, state
, phone_call
, video_call
.
The values from the key correspond with the value_type
field and the value
field.
I need to make the values from the key
field into a column, and assign the value from the value
field to it. On top of that, I only need to get the provider
, referral
, phone_call
, and video_call
. I am currently doing a case statement
, but that feels a bit clunky.
The sample code looks something like this:
id|name |code|state_id|key |label |value_type|value |
--+----------+----+--------+-------------------------------------+---------------------------------+----------+----------+
52|Porto Rico|PR | 52|phone_call |Phone Call |boolean |0 |
52|Porto Rico|PR | 52|video_call |Video Call |boolean |0 |
52|Porto Rico|PR | 52|code |Code |text |PR |
52|Porto Rico|PR | 52|referral |Referral |select |ASYNC |
52|Porto Rico|PR | 52|provider |Provider |boolean |false |
52|Porto Rico|PR | 52|state |State |text |Porto Rico|
My code currently looks like this:
SELECT
state_id
, name AS state
, code
, CASE
WHEN key = "provider"
THEN value
END AS provider
, CASE
WHEN key = "referral"
THEN value
END AS referral
, CASE
WHEN key = "phone_call"
THEN value
END AS phone_call
, CASE
WHEN key = "video_call"
THEN value
END AS video_call
FROM {{ ref('enrich_states')}}
Is there a better way to handle this in DBT
rather than using case statements? I feel that it's a bit clunky but because I only need specific values from the key
I can't think of another way to make this cleaner. Could I get some help on this, please?
I believe you have two options.
-- FIRST PART
{% set keys_query %}
SELECT DISTINCT key FROM {{ ref('enrich_states') }}
{% endset %}
{% set results = run_query(keys_query) %}
{% if execute %}
{% set results_list = results.columns[0].values() %}
{% else %}
{% set results_list = [] %}
{% endif %}
-- SECOND PART
SELECT * FROM {{ ref('enrich_states') }}
PIVOT(MAX(value) FOR key IN (
{% for key in results_list %}
'{{ key }}'
{% if not loop.last %},{% endif %}
{% endfor %}
))
run_query
. That's for the case if you don't want to predefine a list of keys (see details here). If you know them upfront, just replace the first part with {% set results_list = ['phone_call', 'video_call','code'] %}
run_query
to work you need to add dbt_utils
to your packages.yml
and run dbt deps
packages:
- package: dbt-labs/dbt_utils
version: 1.1.1
{% set keys_query %}
SELECT DISTINCT key FROM {{ ref('enrich_states') }}
{% endset %}
{% set results = run_query(keys_query) %}
{% if execute %}
{% set results_list = results.columns[0].values() %}
{% else %}
{% set results_list = [] %}
{% endif %}
SELECT
state_id
, name AS state
{% for key in results_list %}
, CASE
WHEN key = '{{ key }}'
THEN value
END AS {{ key }}
{% endfor %}
FROM {{ ref('enrich_states') }}
I haven't tested it in the actual environment, so let me know if you face any bugs