Search code examples
pythonpython-polars

Add business days including weekends


I'm trying to adjust a date by adding a specified number of business days but I would like to adjust for weekends. The weekend days, however, could change depending on the record. So if my data set looks like this:

┌────────────┬────────┬──────────┬──────────┐
│ DT         ┆ N_DAYS ┆ WKND1    ┆ WKND2    │
│ ---        ┆ ---    ┆ ---      ┆ ---      │
│ date       ┆ i64    ┆ str      ┆ str      │
╞════════════╪════════╪══════════╪══════════╡
│ 2025-01-02 ┆ 2      ┆ Saturday ┆ Sunday   │
│ 2025-01-09 ┆ 2      ┆ Friday   ┆ Saturday │
│ 2025-01-10 ┆ 2      ┆ Saturday ┆ null     │
│ 2025-01-15 ┆ 1      ┆ Saturday ┆ Sunday   │
└────────────┴────────┴──────────┴──────────┘

I can apply:

df = df.with_columns(pl.col('DT').dt.add_business_days(pl.col('N_DAYS')).alias('NEW_DT'))
┌────────────┬────────┬──────────┬──────────┬────────────┐
│ DT         ┆ N_DAYS ┆ WKND1    ┆ WKND2    ┆ NEW_DT     │
│ ---        ┆ ---    ┆ ---      ┆ ---      ┆ ---        │
│ date       ┆ i64    ┆ str      ┆ str      ┆ date       │
╞════════════╪════════╪══════════╪══════════╪════════════╡
│ 2025-01-02 ┆ 2      ┆ Saturday ┆ Sunday   ┆ 2025-01-06 │
│ 2025-01-09 ┆ 2      ┆ Friday   ┆ Saturday ┆ 2025-01-13 │
│ 2025-01-10 ┆ 2      ┆ Saturday ┆ null     ┆ 2025-01-14 │
│ 2025-01-15 ┆ 1      ┆ Saturday ┆ Sunday   ┆ 2025-01-16 │
└────────────┴────────┴──────────┴──────────┴────────────┘

However, I've been trying to generate a week_mask tuple for each of the records based on columns WKND1, WKND2 and apply it as part of my transformation so for the first record, the tuple should be:

(True, True, True, True, True, False, False)

Second Record would be:

(True, True, True, True, False, False, True)

and so on.

Based on the example below the actual response should be:

┌────────────┬────────┬──────────┬──────────┬────────────┐
│ DT         ┆ N_DAYS ┆ WKND1    ┆ WKND2    ┆ NEW_DT     │
│ ---        ┆ ---    ┆ ---      ┆ ---      ┆ ---        │
│ date       ┆ i64    ┆ str      ┆ str      ┆ date       │
╞════════════╪════════╪══════════╪══════════╪════════════╡
│ 2025-01-02 ┆ 2      ┆ Saturday ┆ Sunday   ┆ 2025-01-06 │
│ 2025-01-09 ┆ 2      ┆ Friday   ┆ Saturday ┆ 2025-01-14 │
│ 2025-01-10 ┆ 2      ┆ Saturday ┆ null     ┆ 2025-01-13 │
│ 2025-01-15 ┆ 1      ┆ Saturday ┆ Sunday   ┆ 2025-01-16 │
└────────────┴────────┴──────────┴──────────┴────────────┘

How can I generate the tuple based on column values and apply it dynamically?

I tried creating a new column containing a list and using something like this:

df = df.with_columns(pl.col('DT').dt.add_business_days(pl.col('N_DAYS'), week_mask=pl.col('W_MASK')).alias('NEW_DT'))

but getting:

TypeError: argument 'week_mask': 'Expr' object cannot be converted to 'Sequence'

Solution

  • week_mask supposed to be be Iterable, so it seems you can't pass expression there.

    You can iterate over different masks though:

    weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
    
    pl.concat([
        v.with_columns(
            pl.col('DT').dt.add_business_days(
                pl.col('N_DAYS'),
                week_mask=[x not in k for x in weekdays]
            ).alias('NEW_DT')
        ) for k, v in df.partition_by('WKND1','WKND2', as_dict = True).items()
    ]).sort('DT')
    
    shape: (4, 5)
    ┌────────────┬────────┬──────────┬──────────┬────────────┐
    │ DT         ┆ N_DAYS ┆ WKND1    ┆ WKND2    ┆ NEW_DT     │
    │ ---        ┆ ---    ┆ ---      ┆ ---      ┆ ---        │
    │ date       ┆ i64    ┆ str      ┆ str      ┆ date       │
    ╞════════════╪════════╪══════════╪══════════╪════════════╡
    │ 2025-01-02 ┆ 2      ┆ Saturday ┆ Sunday   ┆ 2025-01-06 │
    │ 2025-01-09 ┆ 2      ┆ Friday   ┆ Saturday ┆ 2025-01-13 │
    │ 2025-01-10 ┆ 2      ┆ Saturday ┆ null     ┆ 2025-01-13 │
    │ 2025-01-15 ┆ 1      ┆ Saturday ┆ Sunday   ┆ 2025-01-16 │
    └────────────┴────────┴──────────┴──────────┴────────────┘