Tags: python, regex, multiple-matches

Storing Python RegEx multiple groups


I'm scraping a website using Python. The returned results have the following format ( https://regex101.com/r/irr14u/10 ). Everything works fine except for the last case, where I get two matches for the dates (first match: Thur.-Sun., Tue., Wed.; second match: Mon.).

I'm using the following code to get the values I want. I use BeautifulSoup to get the movieDate string, but I have hardcoded it here.

import re
movieDate = "Thur.-Sun., Tue., Wed.: 20.50/ 23.00, Mon. 23.00"

# re.match returns only the first match, anchored at the start of the string
m = re.match(r',? *(?P<weekDays>[^\d:\n]+):? *(?P<startTime>[^,\n]+)', movieDate)
weekDays = m.group('weekDays')
startTime = m.group('startTime')

I want to create a dictionary like the one below. It has two keys because there are two startTime values: the first key is Thur.-Sun., Tue., Wed. with the value 20.50/ 23.00, and the second key is Mon. with the value 23.00. In other cases there might be one key, or more than two.

dictionary = {'Thur.-Sun., Tue., Wed.': '20.50/ 23.00', 'Mon.': '23.00'}

Any suggestions for achieving this in a non-buggy way?


Solution

  • You can get the desired output with the finditer function, which yields every match in the string; add each match's captured groups to a dict as you iterate.

    Python snippet:

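    import re

    movieDate = """
    Thur.-Sun., Tue., Wed.: 20.50/ 23.00, Mon. 23.00
    """

    d = {}
    # Raw string, so that \d reaches the regex engine unchanged
    r = re.compile(r',? *(?P<weekDays>[^\d:\n]+):? *(?P<startTime>[^,\n]+)')
    # finditer yields every non-overlapping match, not just the first one
    for m in r.finditer(movieDate):
        d[m.group('weekDays')] = m.group('startTime')

    print(d)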
    

    Prints:

    {'Thur.-Sun., Tue., Wed.': '20.50/ 23.00', 'Mon. ': '23.00'}
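
Note that the second key in the output above is 'Mon. ' with a trailing space, because the weekDays group also captures the space before the time. If the keys should match the dictionary asked for in the question, one option is to strip whitespace from each captured group. The following is a minimal sketch under that assumption; the helper name parse_showtimes is hypothetical and not part of the original answer.

```python
import re

# Same pattern as in the answer, written as a raw string.
PATTERN = re.compile(r',? *(?P<weekDays>[^\d:\n]+):? *(?P<startTime>[^,\n]+)')


def parse_showtimes(text):
    """Map each weekday group to its start times (hypothetical helper)."""
    return {
        # strip() removes the space that weekDays captures before the time
        m.group('weekDays').strip(): m.group('startTime').strip()
        for m in PATTERN.finditer(text)
    }


movieDate = "Thur.-Sun., Tue., Wed.: 20.50/ 23.00, Mon. 23.00"
print(parse_showtimes(movieDate))
# {'Thur.-Sun., Tue., Wed.': '20.50/ 23.00', 'Mon.': '23.00'}
```

Because the dict is built from however many matches finditer yields, the same helper should also cover strings with one key or with more than two.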