Currently my script is subtracting my current time with the times that i have in a Dataframe column called "Creation", generating a new column with the days of the difference. I get the difference days with this code:
df['Creation']= pandas.to_datetime(df["Creation"],dayfirst="True")
#Generates new column with the days.
df['Difference'] = df.to_datetime('now') - df['Creation']
What i want to now is for it to give me the days like hes giving me but dont count the Saturdays and Sundays. How can i do that ?
you can make use of numpy
's busday_count, Ex:
import pandas as pd
import numpy as np
# some dummy data
df = pd.DataFrame({'Creation': ['2021-03-29', '2021-03-30']})
# make sure we have datetime
df['Creation'] = pd.to_datetime(df['Creation'])
# set now to a fixed date
now = pd.Timestamp('2021-04-05')
# difference in business days, excluding weekends
# need to cast to datetime64[D] dtype so that np.busday_count works
df['busday_diff'] = np.busday_count(df['Creation'].values.astype('datetime64[D]'),
np.repeat(now, df['Creation'].size).astype('datetime64[D]'))
df['busday_diff'] # since I didn't define holidays, potential Easter holiday is excluded:
0 5
1 4
Name: busday_diff, dtype: int64
If you need the output to be of dtype timedelta
, you can easily cast to that via
df['busday_diff'] = pd.to_timedelta(df['busday_diff'], unit='d')
df['busday_diff']
0 5 days
1 4 days
Name: busday_diff, dtype: timedelta64[ns]
Note: np.busday_count
also allows you to set a custom weekmask (exclude days other than Saturday and Sunday) or a list of holidays. See the docs I linked on top.
Related: Calculate difference between two dates excluding weekends in python?, how to use (np.busday_count) with pandas.core.series.Series