python, regex, python-re

Capturing groups of movie title


I am trying to capture the following groups from a movie title:

file = "The Great Home Se01E01 Meatballs for Dinner"

<show> = "The Great Home"
<season> = "Se01"
<episode> = "E01"
<title> = "Meatballs for Dinner"

So far, I have only managed to capture the season and episode, using the following code:

import re

file = "The Great Home Se01E01 Meatballs for Dinner"
seasonEpID = re.search(r'(\bS/?.+\d{1,2})+(E/?.+\d{1,2})', file)
print(seasonEpID.groups())

This prints the following:

('Se01', 'E01')

How can one capture the four groups <show>, <season>, <episode>, <title>?


Solution

  • import re
    file = "The Great Home Se01E01 Meatballs for Dinner"
    match = re.fullmatch(r"(?P<show>.+?) (?P<season>Se\d+)(?P<episode>E\d+) (?P<title>.+)", file)
    print(match.groupdict() if match else "No match") 
    
    '''
    {
      'episode': 'E01',
      'season': 'Se01',
      'show': 'The Great Home',
      'title': 'Meatballs for Dinner'
    }
    '''
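
Why this works: the show group uses a lazy `.+?`, so it stops at the first place where a space followed by `Se<digits>E<digits>` and another space can match; the greedy `.+` in the title group then takes the rest, and `re.fullmatch` requires the whole string to fit the pattern. Below is a minimal sketch of the same pattern written with `re.VERBOSE` so each group can carry a comment. The names `EPISODE_PATTERN` and `parse_episode` are hypothetical and only for illustration, and the sketch assumes every name follows the `<show> Se<NN>E<NN> <title>` layout from the question.

```python
import re

# Hypothetical names for illustration; the pattern is the one from the answer,
# spread over several lines with re.VERBOSE.
EPISODE_PATTERN = re.compile(
    r"""
    (?P<show>.+?)        # show name, lazy so it stops before the season marker
    [ ]                  # literal space (bare whitespace is ignored in VERBOSE mode)
    (?P<season>Se\d+)    # season marker such as Se01
    (?P<episode>E\d+)    # episode marker such as E01
    [ ]                  # literal space before the title
    (?P<title>.+)        # everything that remains is the episode title
    """,
    re.VERBOSE,
)


def parse_episode(name):
    """Return the four named groups as a dict, or None if the name does not fit."""
    match = EPISODE_PATTERN.fullmatch(name)
    return match.groupdict() if match else None


print(parse_episode("The Great Home Se01E01 Meatballs for Dinner"))
print(parse_episode("The Great Home Meatballs for Dinner"))  # no season/episode marker
```

Returning `None` for names that do not fit keeps the caller from hitting an `AttributeError` on a failed match, which is the same guard the `if match else "No match"` expression provides above.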