
Conditional cases in the Python 3.10 match statement (structural pattern matching)


I'm currently developing something and was wondering whether the new match statement in Python 3.10 is suited to a use case where I have conditional statements.

As input I have a timestamp and a dataframe with dates and values. The goal is to loop over all rows and add each value to the corresponding bin based on its date. Which bin a value goes into depends on the date relative to the timestamp: a date within 1 month of the timestamp is placed in bin 1, within 2 months in bin 2, and so on.

The code that I have now is as follows:

bins = [0] * 7

for date, value in zip(df.iloc[:,0],df.iloc[:,1]):
    match [date,value]:
        case [date,value] if date < timestamp + pd.Timedelta(1,'m'):
            bins[0] += value
        case [date,value] if date > timestamp + pd.Timedelta(1,'m') and date < timestamp + pd.Timedelta(2,'m'):
            bins[1] += value
        case [date,value] if date > timestamp + pd.Timedelta(2,'m') and date < timestamp + pd.Timedelta(3,'m'):
            bins[2] += value
        case [date,value] if date > timestamp + pd.Timedelta(3,'m') and date < timestamp + pd.Timedelta(4,'m'):
            bins[3] += value
        case [date,value] if date > timestamp + pd.Timedelta(4,'m') and date < timestamp + pd.Timedelta(5,'m'):
            bins[4] += value
        case [date,value] if date > timestamp + pd.Timedelta(5,'m') and date < timestamp + pd.Timedelta(6,'m'):
            bins[5] += value

Correction: originally I stated that this code does not work. It turns out that it actually does. However, I am still wondering if this would be an appropriate use of the match statement.


Solution

  • I'd say it's not a good use of structural pattern matching, because there is no actual structure to match. You are checking the value of a single object, so an if/elif chain is a much better, more readable, and more natural choice.

    I have two more issues with the way you wrote it:

    1. You do not handle values that fall exactly on the edges of the bins.
    2. You check the same condition twice. By the time a case in match/case is evaluated, all previous cases are guaranteed not to have matched, so there is no need for date > timestamp + pd.Timedelta(1,'m') and ...: if the earlier check date < timestamp + pd.Timedelta(1,'m') failed, you already know the date is not smaller. (Equality is an edge case, but it has to be handled somehow anyway.)
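
The two points above can be seen in a small sketch. It uses only the standard library and assumes a fixed 30-day step as a stand-in for "one month"; strict_bin and chained_bin are hypothetical helpers, not code from the question.

```python
from datetime import datetime, timedelta

STEP = timedelta(days=30)  # assumption: stand-in for "one month"


def strict_bin(date, timestamp, n_bins=6):
    """Mirror the question's guards: strict < and > on both sides."""
    if date < timestamp + STEP:
        return 0
    for k in range(1, n_bins):
        if timestamp + k * STEP < date < timestamp + (k + 1) * STEP:
            return k
    return None  # no guard matched


def chained_bin(date, timestamp, n_bins=6):
    """Test only the upper edge; earlier failures already imply the lower bound."""
    for k in range(n_bins):
        if date < timestamp + (k + 1) * STEP:
            return k
    return None


timestamp = datetime(2022, 1, 1)
on_edge = timestamp + STEP  # exactly on the first bin edge

print(strict_bin(on_edge, timestamp))   # None: the value is silently dropped
print(chained_bin(on_edge, timestamp))  # 1: the edge belongs to the next bin
```

Away from the edges the two functions agree, so dropping the lower-bound check loses nothing and closes the gap at equality.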

    All in all, I think this is the cleaner solution:

    for date, value in zip(df.iloc[:, 0], df.iloc[:, 1]):
        if date < timestamp + pd.Timedelta(1,'m'):
            bins[0] += value
        elif date < timestamp + pd.Timedelta(2,'m'):
            bins[1] += value
        elif date < timestamp + pd.Timedelta(3,'m'):
            bins[2] += value
        elif date < timestamp + pd.Timedelta(4,'m'):
            bins[3] += value
        elif date < timestamp + pd.Timedelta(5,'m'):
            bins[4] += value
        elif date < timestamp + pd.Timedelta(6,'m'):
            bins[5] += value
        else:
            pass
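
One caveat about both versions: in pandas the unit 'm' means minutes, so pd.Timedelta(1,'m') is one minute, not one month. Calendar months are usually expressed with pd.DateOffset(months=1) instead. The sketch below shows the same binning with month-based edges and a bisect lookup in place of the six branches. It uses only the standard library; add_months and bin_index are hypothetical helpers, the sample rows are made up, and clamping to the last day of a short month is an assumption about how the edges should behave.

```python
import bisect
import calendar
from datetime import datetime


def add_months(ts, k):
    """Shift ts by k calendar months, clamping the day to the month's length."""
    month_index = ts.month - 1 + k
    year = ts.year + month_index // 12
    month = month_index % 12 + 1
    day = min(ts.day, calendar.monthrange(year, month)[1])
    return ts.replace(year=year, month=month, day=day)


def bin_index(date, timestamp, n_bins=6):
    """Return the bin for date, or None if it is n_bins or more months out."""
    edges = [add_months(timestamp, k) for k in range(1, n_bins + 1)]
    idx = bisect.bisect_right(edges, date)  # a date on an edge goes to the next bin
    return idx if idx < n_bins else None


timestamp = datetime(2022, 1, 15)
rows = [  # made-up (date, value) pairs standing in for the dataframe rows
    (datetime(2022, 1, 20), 10),
    (datetime(2022, 2, 15), 5),   # exactly on the first edge
    (datetime(2022, 3, 1), 7),
    (datetime(2023, 1, 1), 99),   # past the last edge, ignored
]

bins = [0] * 6
for date, value in rows:
    idx = bin_index(date, timestamp)
    if idx is not None:
        bins[idx] += value

print(bins)  # [10, 12, 0, 0, 0, 0]
```

As in the if/elif chain, dates before the timestamp land in the first bin and dates past the last edge are skipped. Since pd.Timestamp subclasses datetime, the helpers should accept the dataframe's values as well, though that is not exercised here.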