Search code examples
pythonexcelpandasdatetimedatetime-conversion

How to convert No of days to Hours in python


I have a Age.xlsx where I am calculating the age of incidents in no of Days(Granular level I have to found Age in Hours) I am stuck with a situation, I have to calculate no of days with below two conditions.

  1. If the completion date is given then No of days= Completed date-submit date
  2. If the Completed date is NaT(Or blank)(If incident still open then No of days=L1+L2

Additionally, How can I convert days to hours eg. "2 days 08:00:00" to "56 hours"?

Sample data:

    Incident    Status      Priority    Team    Submit Date          Completed Date     L1  L2  No_of_days Age_in_Hours
    INC0011     Pending     Medium      L2      2020-06-04 00:00:00                     1   1             
    INC0012     completed   High        L2      2020-06-04 00:00:00 2020-06-08 02:00:00 2   2      
    INC0013     In progress Low L1              2020-06-05 00:00:00                     1   2      
    INC0014     Assigned    Medium      L1      2020-06-05 00:00:00                     1   3      
    INC0015     completed   High        L2      2020-06-05 00:00:00 2020-06-05 01:00:00 0   1       

Sampel code:

data = pd.read_excel('C:\Age.xlsx')
x=(data['Completed Date']- data['Submit Date'])

Solution

  • You can create a boolean mask based on whether or not the completed date column is filled, and use loc to conditionally fill the number of days column:

    • if there is a completion date, take completion date minus start date
    • it not, take the sum of L1 and L2 and make a pandas.Timedelta for that many days

    This will make the "No_of_days" column be all Timedelta, which you can then easily convert to hours by division:

    #mask is where "Completed Date" is null
    mask = df['Completed Date'].isnull()
    
    df.loc[~mask, 'No_of_days'] = df['Completed Date'][~mask] - df['Submit Date'][~mask]
    df.loc[mask, 'No_of_days'] = pd.to_timedelta(df['L1'][mask] + df['L2'][mask], unit='days')
    
    df['Age_in_Hours'] = df['No_of_days']/pd.Timedelta(hours=1)
    

    Note that there is a difference in precision between times with completed dates and times without (as the L1 and L2 are not providing hours).