I'm trying to do some string cleanup and wanted to keep only the part that's before a given pattern and remove everything including the pattern and everything after. The pattern is 'hi {{user_name}}'. For example:
Input: What a good day. hi {{user_name}} how are you
Expected output: What a good day.
Input: What a good day. hi {{user_name}} let me know how I can help.
Expected output: What a good day.
Input: What a great day!
Expected output: What a great day!
I tried to do split by pattern and keep only the first part of the split result, but wasn't able to find a good way in SQL
You can try using regex_replace
with hi {{user_name}}.*
pattern (explanation at regex101):
-- sample data
with dataset(str) as(
values
('What a good day. hi {{user_name}} how are you'),
('What a good day. hi {{user_name}} let me know how I can help.'),
('What a great day!')
)
-- query
SELECT regexp_replace(str, 'hi {{user_name}}.*', '')
FROM dataset;
Output (note that this will keep the space at the end):
_col0 |
---|
What a good day. |
What a good day. |
What a great day! |