I previously tried writing a regex, but it replaces every comma in the text with a period:
Data_preprocessing['tweet_without_stopwords'] = Data_preprocessing['tweet_without_stopwords'].apply(lambda x: re.sub(",",'.', str(x)))
How do I write a regex so that it only applies to decimal numbers? That is, I want every expression in the text of the form number,number to become number.number.
Data_preprocessing['tweet_without_stopwords'] = Data_preprocessing['tweet_without_stopwords'].apply(lambda x: re.sub("(\d*)\.(\d*)","\1,\2", str(x)))
Squares appeared in the output instead of the digits :D
3.
Data_preprocessing['tweet_without_stopwords'] = Data_preprocessing['tweet_without_stopwords'].apply(lambda x: re.sub("(\d+)\,(\d+)","\1.\2", str(x)))
The regex you need is "(\d+),(\d+)", with the replacement "\1.\2". Breakdown:
(\d+) at least one digit (group 1)
, a literal ,
(\d+) at least one digit (group 2)
replacement:
\1 group 1
. a period
\2 group 2
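Applied to your code, the relevant part would be the line below. Note the r prefix on both strings: without it, Python reads \1 and \2 as control characters rather than group references, which is where the squares came from.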
lambda x: re.sub(r"(\d+),(\d+)",r"\1.\2", str(x))
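To check the behaviour outside the DataFrame, here is a minimal sketch using only the standard re module. The helper name commas_to_periods and the sample sentence are made up for illustration; they are not from your code.

```python
import re

def commas_to_periods(text):
    """Turn a comma that sits between digits into a period; leave other commas alone."""
    return re.sub(r"(\d+),(\d+)", r"\1.\2", str(text))

# Hypothetical sample: two decimal numbers plus ordinary commas
sample = "Price rose 3,5 percent, then fell to 12,75, sadly"
fixed = commas_to_periods(sample)
print(fixed)  # Price rose 3.5 percent, then fell to 12.75, sadly

# Same substitution, but with a non-raw replacement string:
# "\1" and "\2" become the control characters chr(1) and chr(2),
# which most fonts render as squares.
broken = re.sub(r"(\d+),(\d+)", "\1.\2", sample)
print(repr(broken))
```

The commas after "percent" and after "12.75" are untouched because they do not have digits on both sides. The same function can be passed directly to .apply in place of the lambda.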