Search code examples
pythondatetimepandasunix-timestamp

Convert (back and forth) UNIX timestamp to pandas.tslib.Timestamp and datetime for series


I am working with python 3.5.2, pandas 0.18.1 and sqlite3.

In my data base, I have a column unix_time with INT for seconds since 1970. Ideally I want to read my dataframe from sqlite, and then create a time column which would correspond to the datetime or pandas.tslib.Timestamp conversion of the unix_time column that I woul only use for some processing and then drop before saving the dataframe back.

The issue is that when parsing the unix_time column using :

df = pd.read_from_sql_query("SELECT * FROM test", con, parse_dates=['unix_time'])

I obtain pandas.tslib.Timestamp types which is fine for my processing, but then I have to recreate my original unix_time column using :

df['unix_time'][i] = (df['unix_time'][i] - datetime(1970,1,1)).total_seconds()

which is really 'dirty'

First question : Do you have a better way?

I thought about giving up the unix time format and only use datetime format but the to_datetime method from pandas returns in fact pandas.tslib.Timestamp ... And anyway, doing so would force me to iterate over all rows which is a bad solution. (It is impossible to apply to_datetime on something else than a view over a single cell of the dataframe

Second question : Is it possible to apply it on a series?

My last try was with directly using df['time'] = datetime.datetime.fromtimestamp(df['unix_time']) but surprisingly, it also returns pandas.tslib.Timestamp.

In the end, knowing that I can only save unix timestamps or datetimes, my only choices for the moment are :

  • parsing but then having to convert them back to unix timestamp one by one.

  • Or not parse it but have to convert them to pandas.tslib.Timestamp one by one.

It would be great if I could convert a whole series.

Last question : Is there a way to convert a unix timestamps series to datetime (or at least pandas.tslib.Timestamp), or a pandas.tslib.Timestamp (or datetime) series to unix timestamps?

Thanks

EDIT: During my processing, I extract a row that I want to append to my dataset. Apparently, the coversion to pandas.tslib.Timestamp appends implicitly when passing from dataframe to serie :

df = pd.DataFrame({'UNX':pd.date_range('2016-01-01', freq='9999S', periods=10).astype(np.int64)//10**9})
df['Date'] = pd.to_datetime(df.UNX, unit='s')
print(df.Date.dtypes)
print(type(df['Date'][0]))
test = df.iloc[0]
print(type(test.Date))
new_df = test.to_frame().transpose()   #from here, impossible to do : new_df.to_sql("test", con) because the type for 'Date' is not supported
print(new_df.Date.dtypes)

returns

datetime64[ns]
<class 'pandas.tslib.Timestamp'>
<class 'pandas.tslib.Timestamp'>
object

Is there a way to convert the 'Date' in new_df from pandas.tslib.Timestamp to datetime64[ns] or datetime.datetime (or simply str) ?


Solution

  • IIUC you can do it this way:

    In [96]: df = pd.DataFrame({'UNX':pd.date_range('2016-01-01', freq='9999S', periods=10).astype(np.int64)//10**9})
    
    In [97]: df
    Out[97]:
              UNX
    0  1451606400
    1  1451616399
    2  1451626398
    3  1451636397
    4  1451646396
    5  1451656395
    6  1451666394
    7  1451676393
    8  1451686392
    9  1451696391
    

    Convert UNIX epoch to Python datetime:

    In [98]: df['Date'] = pd.to_datetime(df.UNX, unit='s')
    
    In [99]: df
    Out[99]:
              UNX                Date
    0  1451606400 2016-01-01 00:00:00
    1  1451616399 2016-01-01 02:46:39
    2  1451626398 2016-01-01 05:33:18
    3  1451636397 2016-01-01 08:19:57
    4  1451646396 2016-01-01 11:06:36
    5  1451656395 2016-01-01 13:53:15
    6  1451666394 2016-01-01 16:39:54
    7  1451676393 2016-01-01 19:26:33
    8  1451686392 2016-01-01 22:13:12
    9  1451696391 2016-01-02 00:59:51
    

    Convert datetime to UNIX epoch:

    In [100]: df['UNX2'] = df.Date.astype('int64')//10**9
    
    In [101]: df
    Out[101]:
              UNX                Date        UNX2
    0  1451606400 2016-01-01 00:00:00  1451606400
    1  1451616399 2016-01-01 02:46:39  1451616399
    2  1451626398 2016-01-01 05:33:18  1451626398
    3  1451636397 2016-01-01 08:19:57  1451636397
    4  1451646396 2016-01-01 11:06:36  1451646396
    5  1451656395 2016-01-01 13:53:15  1451656395
    6  1451666394 2016-01-01 16:39:54  1451666394
    7  1451676393 2016-01-01 19:26:33  1451676393
    8  1451686392 2016-01-01 22:13:12  1451686392
    9  1451696391 2016-01-02 00:59:51  1451696391
    

    Check:

    In [102]: df.UNX.eq(df.UNX2).all()
    Out[102]: True