I have the following dataframe:
import datetime
import pandas as pd

df = pd.DataFrame({'user': ['Andrea', 'Gioele'],
                   'year': [1983, 2014],
                   'month': [11, 1],
                   'day': [8, 11]})
Then I create the date for every row in two ways. First:
df['dateA'] = df.apply(lambda x: datetime.date(x['year'],x['month'],x['day']), axis=1)
Second:
df['dateB'] = pd.to_datetime(df[['year','month','day']])
This gives the following dataframe:
>>> df
   day  month    user  year       dateA       dateB
0    8     11  Andrea  1983  1983-11-08  1983-11-08
1   11      1  Gioele  2014  2014-01-11  2014-01-11
The two columns have different dtypes:
>>> df['dateA']
0    1983-11-08
1    2014-01-11
Name: dateA, dtype: object
>>> df['dateB']
0   1983-11-08
1   2014-01-11
Name: dateB, dtype: datetime64[ns]
Moreover:
>>> df['dateA'].iloc[0]
datetime.date(1983, 11, 8)
>>> df['dateB'].iloc[0]
Timestamp('1983-11-08 00:00:00')
The problem is that building the dates with the first method is quite expensive, so I would like to convert df['dateB'] so that it holds datetime.date objects (dtype object), like df['dateA']. Is there a way?
Note: I have already tried what the possible duplicate questions suggest (they always deal with strings, not timestamps), but I get the following error:
>>> datetime.datetime.fromtimestamp(df['dateB'].iloc[0])
Traceback (most recent call last):
File "<pyshell#68>", line 1, in <module>
datetime.datetime.fromtimestamp(df['dateB'].iloc[0])
TypeError: a float is required
I think you can use dt.date:
df['dateB'] = pd.to_datetime(df[['year','month','day']]).dt.date
print (df['dateB'].dtype)
object
print (type(df['dateB'].iloc[0]))
<class 'datetime.date'>
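If the datetime64 column already exists, the same accessor should also work on it directly, without calling pd.to_datetime again. The sketch below assumes the dataframe from the question; the variable names first and single are only illustrative. It also shows Timestamp.date() for a single value, which may be what the fromtimestamp attempt was after (fromtimestamp expects a number of seconds since the epoch, not a Timestamp).

```python
import datetime

import pandas as pd

df = pd.DataFrame({'user': ['Andrea', 'Gioele'],
                   'year': [1983, 2014],
                   'month': [11, 1],
                   'day': [8, 11]})

# Vectorized construction: the column has a datetime64 dtype
df['dateB'] = pd.to_datetime(df[['year', 'month', 'day']])

# Convert the existing column; each element becomes a datetime.date
df['dateB'] = df['dateB'].dt.date

first = df['dateB'].iloc[0]
print(df['dateB'].dtype)
print(type(first))

# For a single Timestamp, .date() gives the same kind of object
single = pd.Timestamp('1983-11-08').date()
print(single == first)
```

Note that once the column holds datetime.date objects, the .dt accessor is no longer available on it, so it is probably best to do this conversion as the last step.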