Search code examples
sqldatabaseunicodeasciisnowflake-schema

How to find invisible character in Snowflake


I am working on pulling data from a dataset where two columns, old_value and new_value should NOT be equal. But I am getting false rows being fetched.

For Example:

old_value new_value
ABC ABC

This row should not be fetched but it was there. And upon figuring out why, it showed below:

old_value new_value len(old_value) len(new_value)
ABC ABC 4 3

I am not sure why it showed "4" for old_value. I did try replacing white space with blank, new line character, regexp_replace, but none worked.

Please suggest how to find and remove this invisible character. Thank you!


Solution

  • you can use regular expresiions to remove non ascii charters

    select regexp_replace(old_value, '[^[:ascii:]]', '')
    

    the pattern you must adapt to your data