For example, I have a list of strings:
sentence = ['cracked $300 million','she\'s resolutely, smitten ', 'that\'s creative [r]', 'the market ( knowledge check : prices up!']
I want to remove the punctuation and replace numbers with the '£' symbol. I have tried this, but only one of the two replacements takes effect when I run them both. My code is below:
import re
s =([re.sub(r'[!":$()[]\',]',' ', word) for word in sentence])
s= [([re.sub(r'\d+','£', word) for word in s])]
print(s)
I think the problem could be in the square brackets? Thank you!
If you want to replace some specific punctuation symbols with a space and any digit chunks with a £ sign, you can use
import re
rx = re.compile(r'''[][!":$()',]|(\d+)''')
sentence = ['cracked $300 million','she\'s resolutely, smitten ', 'that\'s creative [r]', 'the market ( knowledge check : prices up!']
s = [rx.sub(lambda x: '£' if x.group(1) else ' ', word) for word in sentence]
print(s)  # => ['cracked  £ million', 'she s resolutely  smitten ', 'that s creative  r ', 'the market   knowledge check   prices up ']
Note where ] and [ are placed inside the character class: when ] comes first, it does not need to be escaped, and [ does not have to be escaped at all inside character classes. I also used a triple-quoted string literal, so you can use " and ' as is without extra escaping.
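To see why the bracket placement matters, here is a small sketch comparing the character class above with the one from the question. The names fixed and broken are only illustrative. In the question's pattern the first ] closes the class early, so the rest of the pattern is read as literal text.

```python
import re

# ']' placed first in the class is a literal; '[' needs no escaping here.
fixed = re.compile(r'''[][!":$()',]''')
print(fixed.findall("that's creative [r]"))   # ["'", '[', ']']

# In the question's pattern the first ']' ends the class, so the regex
# means: one of  ! " : $ ( ) [  followed by the literal text  ',]
broken = re.compile(r'[!":$()[]\',]')
print(broken.findall("that's creative [r]"))  # []
print(broken.findall("(',]"))                 # ["(',]"]
```

So the first re.sub call in the question leaves the sample strings unchanged, since none of them contains that exact four-character sequence.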
So, here, [][!":$()',]|(\d+) matches ], [, !, ", :, $, (, ), ' or a comma, or matches and captures into Group 1 one or more digits. If Group 1 matched, the replacement is the pound sign (£); otherwise, it is a space.
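For comparison, the two-pass approach from the question should also work once the character class is fixed. This is only a sketch under that assumption; the names punct, digits and single are illustrative and not part of the original code.

```python
import re

sentence = ['cracked $300 million', "she's resolutely, smitten ",
            "that's creative [r]", 'the market ( knowledge check : prices up!']

punct = re.compile(r'''[][!":$()',]''')  # illustrative name
digits = re.compile(r'\d+')              # illustrative name

# Pass 1: each listed punctuation symbol -> space.
# Pass 2: each run of digits -> '£'.
s = [digits.sub('£', punct.sub(' ', word)) for word in sentence]
print(s)

# The single-pass version with a callback gives the same result here.
rx = re.compile(r'''[][!":$()',]|(\d+)''')
single = [rx.sub(lambda m: '£' if m.group(1) else ' ', word) for word in sentence]
print(s == single)  # True
```

The single-pass version scans each string once and keeps both rules in one pattern, which is why I used it above.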