Search code examples
jinja2dbt

String manipulation in DBT - Jinja


I want to develop a macro that will loop rows from a seeding query's result and create a dynamic query for another task.

Let's assume my row would be similar to

<agate.Row: ('product_available', Decimal('0.6'), 'Positive')>
<agate.Row: ('product_quality', Decimal('0.5'), 'Negative')>

I intend to generate an array selection for downstream queries which have my_udf that will take arguments from my seeding query's result. For example,

my_udf("product_available", 0.6, 'Positive')
my_udf("product_quality", 0.5, 'Negative')

The problem that I have is some arguments are actual column names, while others are values. Hence, column names should have double quotes, while value must have single quote. For instance, "product_available" vs 'Positive'

My code is

{% macro generate_list_select_from(seeding_query) %}
    {# ... other code to execute my seeding query ... #}

    {# loop query goes here  #}
    {% for i in results_list %}
      
      {% set item = "my_udf( {{ i[0] }} , {{ i[1] }}, '{{ i[2] }}' )" %}

      {{items.append(item)}}

    {% endfor %}


    {{ return(items) }}
{% endmacro %}

Below is output when I use my macro

select foo_column,  
       my_udf( {{ i[0] }} , {{ i[1] }}, '{{ i[2] }}' ),
       my_udf( {{ i[0] }} , {{ i[1] }}, '{{ i[2] }}' )
from foo_table

My question is how to create a such string?

Update:

  • Tried other way, {% set item = "my_udf(" + {{i[0]}} + ")" %}, I end up with a compilation error expected token ':', got '}'

Solution

  • Don't nest your curlies.

    ~ is the string concatenation operator in jinja. You could use that to build up your function args, including the single quotes:

    {% set item = "my_udf(" ~ i[0] ~ ", " ~ i[1] ~ ", '" ~ i[2] ~ "' )" %}
    

    This is pretty hard to read, though. I'd probably add a macro called quoted:

    {% macro quoted(s) %}
    '{{ s }}'
    {% endmacro %}
    

    And then use the join filter to concatenate the items of a list:

    {% set args = [i[0], i[1], quoted(i[2])] %}
    {% set item = "my_udf(" ~ args | join(", ") ~ ")" %}
    

    Lastly, if the purpose of this macro is to template SQL, then you don't need to use return(items). You should just have the body of the macro template the string you want (see the quoted macro above). That dramatically simplifies things:

    {% macro quoted(s) %}
    '{{ s }}'
    {% endmacro %}
    
    {% macro generate_list_select_from(seeding_query) %}
        {# ... other code to execute my seeding query ... #}
    
        {# loop query goes here  #}
        {% for i in results_list %}
          {% set args = [i[0], i[1], quoted(i[2])] %}
          my_udf({{ args | join(", "}}){% if not loop.last %},{% endif %}
        {% endfor %}
    
    {% endmacro %}