I'm working on a project in Google Colab, where I use the following versions:
!pip install "gensim==4.2.0"
!pip install "texthero==1.0.5"
Until recently, I received the following warning:
    FutureWarning: The default value of regex will change from True to False in a future version.
      return input.str.replace(r"^\d+\s|\s\d+\s|\s\d+$", " ")
But the execution worked normally. Now, I'm getting the following error:
How should I proceed?
I tried different versions, but the problem persists.
This is a texthero bug triggering a pandas error.
Pandas str.replace now uses regex=False by default (since pandas 2.0).
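To illustrate the difference, here is a minimal sketch that passes regex explicitly both ways, so it behaves the same on any pandas version. The sample Series and the "#" replacement are made up for the demonstration and are not taken from texthero.

```python
import pandas as pd

# Hypothetical sample data, only for demonstration.
s = pd.Series(["abc 123 def", "no digits"])

# regex=True: the pattern is interpreted as a regular expression,
# so every run of digits is replaced.
as_regex = s.str.replace(r"\d+", "#", regex=True)

# regex=False: the pattern is treated as a literal string.
# The literal text "\d+" does not occur, so nothing changes.
as_literal = s.str.replace(r"\d+", "#", regex=False)

print(as_regex.tolist())    # ['abc # def', 'no digits']
print(as_literal.tolist())  # ['abc 123 def', 'no digits']
```

A call that omits regex gets the second behavior on pandas 2 and later, which is why code written for the old default stops working as intended.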
Texthero's replace_digits function hasn't been updated in two years and doesn't explicitly pass regex=True:
    if only_blocks:
        pattern = r"\b\d+\b"
        return s.str.replace(pattern, symbols)
    else:
        return s.str.replace(r"\d+", symbols)
You should file a bug report with texthero; there are probably several other occurrences of str.replace to fix.
In the meantime, you can patch the library by changing the code to:
    if only_blocks:
        pattern = r"\b\d+\b"
        return s.str.replace(pattern, symbols, regex=True)
    else:
        return s.str.replace(r"\d+", symbols, regex=True)
Or use a pandas version prior to 2 (e.g. 1.5.2).
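If editing the installed package is inconvenient (on Colab the environment is reset with each new runtime), another option is to keep a local stand-in for the affected function in your notebook. The sketch below is only an illustration: the name replace_digits_fixed and its defaults are my own choices, modeled on the snippet above, and it is not part of texthero.

```python
import pandas as pd


def replace_digits_fixed(s: pd.Series, symbols: str = " ", only_blocks: bool = True) -> pd.Series:
    """Hypothetical local stand-in for texthero's replace_digits.

    Passes regex=True explicitly so the result does not depend on the
    pandas default.
    """
    # \b\d+\b matches only standalone blocks of digits; \d+ matches any digits.
    pattern = r"\b\d+\b" if only_blocks else r"\d+"
    return s.str.replace(pattern, symbols, regex=True)


# Hypothetical sample data, only for demonstration.
s = pd.Series(["1234 falcon9", "room 42"])

blocks_only = replace_digits_fixed(s, "X")
all_digits = replace_digits_fixed(s, "X", only_blocks=False)

print(blocks_only.tolist())  # ['X falcon9', 'room X']
print(all_digits.tolist())   # ['X falconX', 'room X']
```

This only covers the one function shown here; any other texthero function that calls str.replace without regex=True would need the same treatment.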