I'm trying to remove digits and the symbols "-" and "_" from a string that I got by parsing a URL.
For example,
import re
string1 = 'historical-fiction_4'
string_cleaned = re.sub("[^a-z]", "", string1)
print(string1)
print(string_cleaned)
historical-fiction_4
historicalfiction
With re.sub("[^a-z]", "", string1) I get only the letters a to z, but instead of "historicalfiction" I would like to get "Historical Fiction".
More or less all my data is collected with this structure "name1-name2_number".
If anyone can help me improve my re.sub() call, I'd really appreciate it. Thanks a lot!
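You can replace the unwanted characters with a space instead of an empty string, then use str.strip() to drop the leftover spaces at the ends and str.title() to capitalize every word: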
import re
string1 = "historical-fiction_4"
string1 = re.sub(r"[^a-z]", " ", string1).strip().title()
print(string1)
Prints:
Historical Fiction
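Note that the pattern [^a-z] also replaces uppercase letters, and consecutive separators leave more than one space between words. If your data really follows the "name1-name2_number" structure, a more explicit variant may be easier to reason about. This is only a minimal sketch under that assumption; the helper name clean_slug is hypothetical and not part of the original code:

```python
import re

def clean_slug(slug):
    """Turn a slug like 'historical-fiction_4' into 'Historical Fiction'.

    Hypothetical helper; assumes the "name1-name2_number" structure.
    """
    # Drop the trailing "_<number>" suffix, if present.
    name = re.sub(r"_\d+$", "", slug)
    # Treat any run of hyphens/underscores as a single word separator.
    words = re.split(r"[-_]+", name)
    # Capitalize each non-empty word and join with single spaces.
    return " ".join(word.capitalize() for word in words if word)

print(clean_slug("historical-fiction_4"))
```

For the example string this prints Historical Fiction, the same result as the one-liner above.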