Search code examples
pythonpandaslist-comprehensiontimedeltashift

Taking into account weekends/holidays when using timedeltas in pandas


So heres a problem it took me a while to figure why its happening...

I have a dataframe of values with a datetime index of workdays.

Im adding a new column, 'ColB', which is always the same value as colA, EXCEPT where colA is 0. Then I want to use the previous days value of colA.

For example:

Day A B
2019-11-11 7 7
2019-11-12 3 3
2019-11-13 0 3
2019-11-14 5 5
2019-11-15 0 5

Note how colB on 2019-11-13 becomes the value of colA from 2019-11-12. Same on the 15th: colB takes colA from the 14th.

Ive used this list comprehension:

df.loc[:,'colB'] = [df.loc[d-pd.Timedelta(1,'d'),'ColA'] if df.loc[d,'ColA']==0 else df.loc[d,'ColA'] for d in df.index]

I thought it wasnt working (KeyError: Timestamp('2019-11-03 00:00:00')), but I discovered the problem occurs because I have a zero on a monday, and therefore no rows for d-pd.Timedelta(1,'d') which would be a sunday.

I suppose the same problem will happen when I have a zero the day after a holiday too.

To solve it, when col A is zero, I actually need to use the previous value in colA for colB, and NOT the previous DAYS value, as Ive done in the list comprehension.

Ive solved it by reindexing, using sequential numbers rather than dates, but I think its nicer to use the dates.

I tried shift:

df.loc[:,'colB'] = [df.loc[d,'ColA'].shift(-1) if df.loc[d,'ColA']==0 else df.loc[d,'ColA'] for d in df.index]

but get an error: AttributeError: 'numpy.int64' object has no attribute 'shift'

Can anyone see a simple way to do this without reindexing?

Thanks!


Solution

  • Try this:

    df.assign(B = df['A'].replace(0))
    

    Output:

              Day  A  B
    0  2019-11-11  7  7
    1  2019-11-12  3  3
    2  2019-11-13  0  3
    3  2019-11-14  5  5
    4  2019-11-15  0  5