Basically, I want to remove the whitespace that exists after numbers
Input:
medication_title
CLORIDRATO DE VENLAFAXINA 75 MG
VIIBRYD 40 MG
KTRIZ UNO 0.6 U/G
Ouput:
medication_title medication_title2
CLORIDRATO DE VENLAFAXINA 75 MG CLORIDRATO DE VENLAFAXINA 75MG
VIIBRYD 40 MG VIIBRYD 40MG
KTRIZ UNO 0.6 U/G KTRIZ UNO 0.6U/G
Ideas?
We can use a regex replacement here:
SELECT
medication_title,
REGEXP_REPLACE(medication_title,
'\y(\d+(?:\.\d+)?) ([^[:space:]]*G)\y',
'\1\2') AS medication_title2
FROM yourTable;
Here is an explanation of the regex pattern:
\y
word boundary(
match and capture in \1
\d+
a number(?:\.\d+)?
followed by optional decimal component)
close capture group \1
match a space (the one we want to remove)(
match and capture in \2
[^[:space:]]*
zero or more leading non whitespace charactersG
folllwed by "G")
close capture group \2
\y
another word boundary