I need to count words in string in snowflake. the problem is that between the words there are spaces, in variety count.
for example:
str = '67778 19gj5 7770 202ty524 2024 i900 3290 POC/*'
I want to get the number of words- 8.
what I tried:
array_size(split(str, ' '))
return 27 :(
and
array_size(split(str, '\t'))
return 1.
any idea please?
This is implementation of @NickW suggestion with some improvement:
Replace all consecutive whitespace characters (\\s+
) with single space and split it, use array_size, it will work with all whitespace: spaces, tabs, newlines, etc:
WITH t1 AS (
select '67778 19gj5 7770 202ty524 2024 i900 3290 POC/*' as str
)
select array_size(split(regexp_replace(str, '\\s+',' '),' ')) num_words
from t1
Result:
NUM_WORDS
8