I'm fairly new to Python (and this community). This question branches off of one that was asked and answered here a long time ago.
With a list like:
['hello', '...', 'h3.a', 'ds4,']
Creating a new list x with no punctuation (and deleting empty elements) would be:
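import string
x = ['hello', '...', 'h3.a', 'ds4,']
# Drop every punctuation character, then discard elements that ended up empty
x = [''.join(c for c in s if c not in string.punctuation) for s in x]
x = [s for s in x if s]
print(x)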
Output:
['hello', 'h3a', 'ds4']
However, how would I be able to remove all punctuation only from the beginning and end of each element? I mean, to instead output this:
['hello', 'h3.a', 'ds4']
In this case, keeping the period in h3.a but removing the comma at the end of ds4.
You could use regular expressions. re.sub() can replace all matches of a regex with a string.
import re
X = ['hello', '.abcd.efg.', 'h3.a', 'ds4,']
X_rep = [re.sub(r"(^[^\w]+)|([^\w]+$)", "", x) for x in X]
print(X_rep)
# Output: ['hello', 'abcd.efg', 'h3.a', 'ds4']
Explanation of regex:
(^[^\w]+) :
    ^ : Beginning of string
    [^\w]+ : One or more non-word characters
| : The previous expression, or the next expression
([^\w]+$) :
    [^\w]+ : One or more non-word characters
    $ : End of string
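As a possible alternative to the regex, str.strip() accepts a set of characters to remove from both ends of a string, so passing string.punctuation may be enough here. The sketch below is only an illustration: the list name words and the extra '_tag_' element are made up to show one difference between the two approaches. The underscore counts as a word character for \w, so the regex keeps it, while string.punctuation includes it, so strip() removes it.

```python
import re
import string

# Hypothetical sample data; '_tag_' is added only to show the underscore case.
words = ['hello', '...', '.abcd.efg.', 'h3.a', 'ds4,', '_tag_']

# str.strip removes leading/trailing characters found in the given set.
stripped = [w.strip(string.punctuation) for w in words]
# Drop elements that became empty (e.g. '...').
stripped = [w for w in stripped if w]
print(stripped)

# The regex from the answer treats '_' as a word character, so it is kept.
regexed = [re.sub(r"(^[^\w]+)|([^\w]+$)", "", w) for w in words]
regexed = [w for w in regexed if w]
print(regexed)
```

Which one fits better depends on whether underscores (and non-ASCII symbols, which string.punctuation does not cover) should count as punctuation in your data.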