The following two columns in a dataframe called 'question' are copied from a lager dataframe
{'Breddat': {0: Timestamp('2010-05-04 00:00:00'),
1: Timestamp('2011-02-02 00:00:00'),
2: Timestamp('2010-05-04 00:00:00'),
3: Timestamp('2011-04-27 00:00:00'),
4: Timestamp('2012-01-24 00:00:00'),
5: NaT,
6: Timestamp('2015-02-06 00:00:00'),
7: Timestamp('2016-02-04 00:00:00')},
'Result': {0: 176.0,
1: 97.0,
2: 162.0,
3: 112.0,
4: 81.0,
5: nan,
6: 87.0,
7: 97.0}}
If I delete the row with the missing 'Result' value then the following code to calculate 'Cdat' works
question['Cdat'] = (question['Breddat']) - question['Result'].map(dt.timedelta)
I cannot delete all the rows in the main dataframe that have missing values (they are required for other things). I have unsuccessfully tried different approaches to exclude rows with missing values from the calculation without success. I am new to python and pandas and appear to be missing something basic.
Appreciate any help calculating conception date (Cdat) when there are rows with missing values.
Use pd.to_timedelta
instead of datetime.timestamp
to deal with NaN
values.
question['Cdat'] = question['Breddat'] - pd.to_timedelta(question['Result'], unit='D')
>>> question
Breddat Result Cdat
0 2010-05-04 176.0 2009-11-09
1 2011-02-02 97.0 2010-10-28
2 2010-05-04 162.0 2009-11-23
3 2011-04-27 112.0 2011-01-05
4 2012-01-24 81.0 2011-11-04
5 NaT NaN NaT
6 2015-02-06 87.0 2014-11-11
7 2016-02-04 97.0 2015-10-30