Search code examples
dbt

DBT transform that escapes equal sign at beginning of varchar


In order to avoid CSV injection (e.g. = at start of field) - still happens in CSV if opened in Excel

Is there a way in DBT to apply a transformation to each and every column.


Solution

  • I found a solution base on my research. This is perfectly fit base on my requirements as describe @tconbeer.

    1. I have data in a sql database, and some cells contain strings that start with =
    2. I am using dbt to model data in our database
    3. I want to COPY or export that data in our database to a CSV
    4. I want to open that CSV in Excel

    This DBT macro will remove all special characters that are possibly use in CSV injection attacks like '=','@','-','+'. See https://owasp.org/www-community/attacks/CSV_Injection

    Here is my solution:

        {% macro sanitize_csv(value_to_sanitize) -%}
            {{ 
                dbt_utils.replace(
                    dbt_utils.replace(
                        dbt_utils.replace(
                            dbt_utils.replace(value_to_sanitize,"'='", "''"),
                             "'+'", "''"),
                         "'-'", "''"), 
                    "'@'", "''") 
            }}
        {%- endmacro %}
    

    For me this is a bit cleaner than using case when like suggested by @tconbeer. I don't know if its faster though but for me its good solution for now.