I have some garbage data:
trueText = ' 23 Wolkenvelden en lokaal wat regen. In de ochtend op steeds meer plaatsen droog en 24 zon. In de avond kans op onweer,met name in Zeeland. 22 20 23 = max. temp. vandaag '
I want to delete the numbers that sit between the delimiter characters, because they are useless. The text itself can sometimes contain a number, which is why I only want to delete the ones between the delimiter characters.
I have tried some things myself:
trueText = re.sub('[^]+', ' ', trueText)
This deletes everything between the delimiter characters, not just the numbers. I think I have to use the \d sequence, but I can't seem to get the syntax right.
You can pass a callable as the replacement argument of re.sub and remove all digits from each matched value:
trueText = re.sub('[^]+', lambda x: ''.join(c for c in x.group() if not c.isdigit()), trueText)
See the Python demo:
import re
trueText = ' 23 Wolkenvelden en lokaal wat regen. In de ochtend op steeds meer plaatsen droog en 24 zon. In de avond kans op onweer,met name in Zeeland. 22 20 23 = max. temp. vandaag '
print(re.sub('[^]+', lambda x: ''.join(c for c in x.group() if not c.isdigit()), trueText))
Output:
Wolkenvelden en lokaal wat regen. In de ochtend op steeds meer plaatsen droog en zon. In de avond kans op onweer,met name in Zeeland. = max. temp. vandaag
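Since the question mentions \d, here is a minimal sketch of the same idea using a nested re.sub in the callback instead of a character-by-character join. It assumes a visible "|" as a stand-in for the actual delimiter character, and the helper name strip_digits_between and the sample string are hypothetical, made up only for illustration.

```python
import re

# Assumption: "|" stands in for the real delimiter character.
DELIM = "|"

def strip_digits_between(text, delim=DELIM):
    """Remove digits only inside delimiter-enclosed spans (hypothetical helper)."""
    d = re.escape(delim)  # escape in case the delimiter is a regex metacharacter
    # Match: delimiter, one or more non-delimiter characters, delimiter.
    span = re.compile(f"{d}[^{d}]+{d}")
    # For each matched span, drop every digit with \d and keep everything else.
    return span.sub(lambda m: re.sub(r"\d", "", m.group()), text)

sample = "|23|Rain at 10 today.|24|sun|22 20 23|= max"
print(strip_digits_between(sample))
# prints: ||Rain at 10 today.||sun|  |= max
```

The "10" survives because it is not enclosed by a pair of delimiters, while the spaces between the removed numbers in the last span are left in place. If those should go as well, the inner pattern could be widened to something like r"[\d ]+", depending on what the delimited spans may contain.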