Search code examples
pythonpandasdatetimetimedelta

Python Datetime conversion for excel dataframe


1st Image - Excel Sheel(Binary format), 2nd Image - Python extraction output with pandas.read_excel, 3rd Image - After conversion with datetime Hello, I am trying to extract date and time column from my excel data. I am getting column as DataFrame with float values, after using pandas.to_datetime I am getting date with different date than actual date from excel. for example, in excel starting date is 01.01.1901 00:00:00 but in python I am getting 1971-01-03 00:00:00.000000 like this. How can I solve this problem?

Final Output required

I need a final output in total seconds with DataFrame. First cell starting as a 00 sec and very next cell with timestep of seconds (time difference in ever cell is 15min.)

Thank you.


Solution

  • Your input is fractional days, so there's actually no need to convert to datetime if you want the duration in seconds relative to the first entry. Subtract that from the rest of the column and multiply by the number of seconds in a day:

    import pandas as pd
    
    df = pd.DataFrame({"Datum/Zeit": [367.0, 367.010417, 367.020833]})
    
    df["totalseconds"] = (df["Datum/Zeit"] - df["Datum/Zeit"].iloc[0]) * 86400
    
    df["totalseconds"]
    
    0       0.0000
    1     900.0288
    2    1799.9712
    Name: totalseconds, dtype: float64
    

    If you have to use datetime, you'll need to convert to timedelta (duration) to do the same, e.g. like

    df["datetime"] = pd.to_datetime(df["Datum/Zeit"], unit="d")
    
    # df["datetime"]
    # 0   1971-01-03 00:00:00.000000
    # 1   1971-01-03 00:15:00.028800
    # 2   1971-01-03 00:29:59.971200
    # Name: datetime, dtype: datetime64[ns]
    
    # subtraction of datetime from datetime gives timedelta, which has total_seconds:
    df["totalseconds"] = (df["datetime"] - df["datetime"].iloc[0]).dt.total_seconds()
    
    # df["totalseconds"]
    # 0       0.0000
    # 1     900.0288
    # 2    1799.9712
    # Name: totalseconds, dtype: float64