Search code examples
sqlsnowflake-cloud-data-platformtrimspace

Is there a special character for tab indention in snowflake?


I need to count words in string in snowflake. the problem is that between the words there are spaces, in variety count.

for example:

str = '67778        19gj5  7770 202ty524 2024       i900      3290 POC/*'

I want to get the number of words- 8.

what I tried:

array_size(split(str, ' '))

return 27 :(

and

array_size(split(str, '\t'))

return 1.

any idea please?


Solution

  • This is implementation of @NickW suggestion with some improvement:

    Replace all consecutive whitespace characters (\\s+) with single space and split it, use array_size, it will work with all whitespace: spaces, tabs, newlines, etc:

    WITH t1 AS (
    select '67778        19gj5  7770 202ty524 2024       i900      3290 POC/*' as str
        )
        
     select array_size(split(regexp_replace(str, '\\s+',' '),' ')) num_words
     from t1
    

    Result:

    NUM_WORDS
    8