This code works, except with this file name:
Terkel in Trouble 2004
It should return None, but instead the match is 'e 200' because of:
e|x|episode|Ep|^
and
(\d{2,3})
How can I prevent that?
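import re

def getEpisode(filename):
    # Verbose (x), case-insensitive (i) pattern:
    # a marker (or the start of the string), optional whitespace, then 2-3 digits.
    match = re.search(
        r'''(?ix)
        (?:
            e|x|episode|Ep|^
        )
        \s*
        (\d{2,3})
        ''', filename)
    if match:
        print(match)
        return match.group(1)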
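To make the problem concrete, here is a minimal reproduction using the same pattern as above. The name `PATTERN` is only introduced for this sketch. The bare `e` alternative matches the last letter of "Trouble", `\s*` consumes the space, and `\d{2,3}` takes the first three digits of the year:

```python
import re

# Same pattern as in the question; PATTERN is a name made up for this sketch.
PATTERN = re.compile(r'''(?ix)
    (?:
        e|x|episode|Ep|^
    )
    \s*
    (\d{2,3})
''')

match = PATTERN.search("Terkel in Trouble 2004")
print(repr(match.group(0)))  # 'e 200'  (the "e" of "Trouble", a space, then "200")
print(match.group(1))        # 200
```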
**EDIT:**
test = (
"0x01 GdG LO Star Lord Part 1", #1
"S01E01 GdG Verso Nowhere", #2
"Wacky Races Episode 20 X264 Ac3", #3
"Terkel in Trouble 2004", #4 return None, it's ok
"Yu Yu Hakusho Ep 100 secret", #5
"Kakegurui S1 Ep11 La donna che scommette", #6
"Kakegurui S1 Ep12 La donna che gioca", #7
"ep 01 wolf's rain", #8
"Toradora! 08" #9
)
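Try using word boundaries (\b), so that a marker such as "e" only counts at the start of a word and the digits must form a complete number.
Updated regex (used with the case-insensitive flag):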
\b(?:e(?:p(?:isode)?)?|0x|S\d\dE)?\s*?(\d{2,3})\b
Results (the full match for each test string; the episode number itself is in group 1):
1 -> 0x01
2 -> S01E01
3 -> Episode 20
4 ->
5 -> Ep 100
6 -> Ep11
7 -> Ep12
8 -> ep 01
9 -> 08
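As a rough sketch, the updated regex could be dropped into the original function like this. It assumes the pattern is compiled with `re.IGNORECASE` (the listed matches "Ep 100" and "Episode 20" suggest case-insensitive matching), and `EPISODE_RE` is a name chosen here for illustration:

```python
import re

# Assumption: case-insensitive matching, so "Ep", "ep" and "Episode" are all accepted.
EPISODE_RE = re.compile(
    r'\b(?:e(?:p(?:isode)?)?|0x|S\d\dE)?\s*?(\d{2,3})\b',
    re.IGNORECASE,
)

def getEpisode(filename):
    """Return the episode number as a string, or None if nothing matches."""
    match = EPISODE_RE.search(filename)
    return match.group(1) if match else None

test = (
    "0x01 GdG LO Star Lord Part 1",              # 1
    "S01E01 GdG Verso Nowhere",                  # 2
    "Wacky Races Episode 20 X264 Ac3",           # 3
    "Terkel in Trouble 2004",                    # 4
    "Yu Yu Hakusho Ep 100 secret",               # 5
    "Kakegurui S1 Ep11 La donna che scommette",  # 6
    "Kakegurui S1 Ep12 La donna che gioca",      # 7
    "ep 01 wolf's rain",                         # 8
    "Toradora! 08",                              # 9
)

for number, name in enumerate(test, start=1):
    print(number, '->', getEpisode(name))
```

With these inputs, "Terkel in Trouble 2004" yields None because \b cannot match between the digits of "2004", so neither "200" nor "20" can end the match.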