I'm trying to adjust a date by adding a specified number of business days but I would like to adjust for weekends. The weekend days, however, could change depending on the record. So if my data set looks like this:
┌────────────┬────────┬──────────┬──────────┐
│ DT ┆ N_DAYS ┆ WKND1 ┆ WKND2 │
│ --- ┆ --- ┆ --- ┆ --- │
│ date ┆ i64 ┆ str ┆ str │
╞════════════╪════════╪══════════╪══════════╡
│ 2025-01-02 ┆ 2 ┆ Saturday ┆ Sunday │
│ 2025-01-09 ┆ 2 ┆ Friday ┆ Saturday │
│ 2025-01-10 ┆ 2 ┆ Saturday ┆ null │
│ 2025-01-15 ┆ 1 ┆ Saturday ┆ Sunday │
└────────────┴────────┴──────────┴──────────┘
I can apply:
df = df.with_columns(pl.col('DT').dt.add_business_days(pl.col('N_DAYS')).alias('NEW_DT'))
┌────────────┬────────┬──────────┬──────────┬────────────┐
│ DT ┆ N_DAYS ┆ WKND1 ┆ WKND2 ┆ NEW_DT │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ date ┆ i64 ┆ str ┆ str ┆ date │
╞════════════╪════════╪══════════╪══════════╪════════════╡
│ 2025-01-02 ┆ 2 ┆ Saturday ┆ Sunday ┆ 2025-01-06 │
│ 2025-01-09 ┆ 2 ┆ Friday ┆ Saturday ┆ 2025-01-13 │
│ 2025-01-10 ┆ 2 ┆ Saturday ┆ null ┆ 2025-01-14 │
│ 2025-01-15 ┆ 1 ┆ Saturday ┆ Sunday ┆ 2025-01-16 │
└────────────┴────────┴──────────┴──────────┴────────────┘
However, I've been trying to generate a week_mask
tuple for each of the records based on columns WKND1, WKND2
and apply it as part of my transformation so for the first record, the tuple should be:
(True, True, True, True, True, False, False)
Second Record would be:
(True, True, True, True, False, False, True)
and so on.
Based on the example below the actual response should be:
┌────────────┬────────┬──────────┬──────────┬────────────┐
│ DT ┆ N_DAYS ┆ WKND1 ┆ WKND2 ┆ NEW_DT │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ date ┆ i64 ┆ str ┆ str ┆ date │
╞════════════╪════════╪══════════╪══════════╪════════════╡
│ 2025-01-02 ┆ 2 ┆ Saturday ┆ Sunday ┆ 2025-01-06 │
│ 2025-01-09 ┆ 2 ┆ Friday ┆ Saturday ┆ 2025-01-14 │
│ 2025-01-10 ┆ 2 ┆ Saturday ┆ null ┆ 2025-01-13 │
│ 2025-01-15 ┆ 1 ┆ Saturday ┆ Sunday ┆ 2025-01-16 │
└────────────┴────────┴──────────┴──────────┴────────────┘
How can I generate the tuple based on column values and apply it dynamically?
I tried creating a new column containing a list and using something like this:
df = df.with_columns(pl.col('DT').dt.add_business_days(pl.col('N_DAYS'), week_mask=pl.col('W_MASK')).alias('NEW_DT'))
but getting:
TypeError: argument 'week_mask': 'Expr' object cannot be converted to 'Sequence'
week_mask
supposed to be be Iterable
, so it seems you can't pass expression there.
You can iterate over different masks though:
pl.DataFrame.partition_by()
to split DataFrame into dict of dataframes.week_mask
out of partition key.pl.concat()
to concat result dataframes together.weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
pl.concat([
v.with_columns(
pl.col('DT').dt.add_business_days(
pl.col('N_DAYS'),
week_mask=[x not in k for x in weekdays]
).alias('NEW_DT')
) for k, v in df.partition_by('WKND1','WKND2', as_dict = True).items()
]).sort('DT')
shape: (4, 5)
┌────────────┬────────┬──────────┬──────────┬────────────┐
│ DT ┆ N_DAYS ┆ WKND1 ┆ WKND2 ┆ NEW_DT │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ date ┆ i64 ┆ str ┆ str ┆ date │
╞════════════╪════════╪══════════╪══════════╪════════════╡
│ 2025-01-02 ┆ 2 ┆ Saturday ┆ Sunday ┆ 2025-01-06 │
│ 2025-01-09 ┆ 2 ┆ Friday ┆ Saturday ┆ 2025-01-13 │
│ 2025-01-10 ┆ 2 ┆ Saturday ┆ null ┆ 2025-01-13 │
│ 2025-01-15 ┆ 1 ┆ Saturday ┆ Sunday ┆ 2025-01-16 │
└────────────┴────────┴──────────┴──────────┴────────────┘