I want to use a regular expression to detect and substitute some phrases. These phrases follow the same pattern but deviate at some points. All the phrases are in the same string.
For instance I have this string:
/this/is//an example of what I want /to///do
I want to catch all the words inside and including the // and substitute them with "".
To solve this, I used the following code:
import re
txt = "/this/is//an example of what i want /to///do"
re.search("/.*/",txt1, re.VERBOSE)
pattern1 = r"/.*?/\w+"
a = re.sub(pattern1,"",txt)
The result is:
' example of what i want '
which is what I want, that is, to substitute the phrases within // with "". But when I run the same pattern on the following sentence
"/this/is//an example of what i want to /do"
I get
' example of what i want to /do'
How can I use one regex and remove all the phrases and //, irrespective of the number of // in a phrase?
In your example code, you can omit this part re.search("/.*/",txt1, re.VERBOSE)
as is executes the command, but you are not doing anything with the result.
You can match 1 or more /
followed by word chars:
/+\w+
Or a bit broader match, matching one or more /
followed by all chars other than /
or a whitspace chars:
/+[^\s/]+
/+
Match 1+ occurrences of /
[^\s/]+
Match 1+ occurrences of any char except a whitespace char or /
import re
strings = [
"/this/is//an example of what I want /to///do",
"/this/is//an example of what i want to /do"
]
for txt in strings:
pattern1 = r"/+[^\s/]+"
a = re.sub(pattern1, "", txt)
print(a)
Output
example of what I want
example of what i want to