I need to find and replace any space between digits in a long string using regular expression.
I have tried to use regular expression such as [0-9][\s][0-9]
and then regexp_replace such as .withColumn('free_text', regexp_replace('free_text', '[0-9][\s][0-9]', ''))
.
However, the regex matches 1(space)4
where I would like to have only (space)
Here is an example:
What I have:
"Hello. I am Marie. My number is 768 990"
What I would like to have:
"Hello. I am Marie. My number is 768990"
Thanks,
Here is one way to do this, using capture groups:
.withColumn('free_text', regexp_replace('free_text', '([0-9])\s([0-9])', '$1$2'))
The idea here is to match and capture the two digits separated by a whitespace character in between them. Then, we can replace by just the two digits adjacent.