I wish to combine date and time into one timestamp on a list of dataframes and to designate a week to the call date.
Here is the error : ValueError: could not convert string to Timestamp
I have used the following function :
def new_call_time(df):
i=0
df[' CALL_DATE_MANIPULATED']=str(df['CALL_DATE'][i]).split()[0] + ' ' + str(df['CALL_TIME'][i])
df[' UNIX_TIME']= pd.Timestamp(df[' CALL_DATE_MANIPULATED'][i]).value//10 ** 9
df[' WEEK']=''
for i in range(len(df)):
df[' CALL_DATE_MANIPULATED'][i]=str(df['CALL_DATE'][i]).split()[0] + ' ' + str(df['CALL_TIME'][i])
df[' UNIX_TIME'][i]= pd.Timestamp(df[' CALL_DATE_MANIPULATED'][i]).value// 10 ** 9
df[' WEEK'][i]=df[' UNIX_TIME'][i]//604800
return df
Here is the function call statement :
for df in data_frame :
df = new_call_time(df)
Here are the tables I have read from excel sheets (contained in a list called data_frame) :
CALL_DATE CALL_TIME
01-JAN-2016 00:15:06
01-JAN-2016 07:07:00
CALL_DATE CALL_TIME
01-JAN-2016 08:40:38
01-JAN-2016 08:44:14
CALL_DATE CALL_TIME
01-JAN-2016 08:51:10
01-JAN-2016 09:06:31
This is working for an individual dataframe but isn't working for a list of dataframes.
The new tables should have the following columns too : example : data_frame[0] -
CALL_DATE CALL_TIME CALL_DATE_MANIPULATED UNIX_TIME WEEK
01-JAN-2016 00:15:06 01-JAN-2016 00:15:06 1451607306 2400
01-JAN-2016 07:07:00 01-JAN-2016 07:07:00 1451632020 2400
Thanks a lot :)))
Use Series.str.split
with indexing str[0]
and then to_datetime
with parameters errors='coerce'
for NaT
if some values not matched format defined in format
(parameter format is for better performance, but should be omitted):
def new_call_time(df):
df['CALL_DATE_MANIPULATED'] = (df['CALL_DATE'].astype(str).str.split().str[0] + ' ' +
df['CALL_TIME'].astype(str))
dates = pd.to_datetime(df['CALL_DATE_MANIPULATED'],
errors='coerce',
format='%d-%b-%Y %H:%M:%S')
df['UNIX_TIME'] = dates.values.astype(np.int64) // 10 ** 9
df['WEEK'] = df['UNIX_TIME'] //604800
return df
Call function in list comprehension for new list of DataFrames
:
data_frame1 = [new_call_time(df) for df in data_frame]