Search code examples
pythonpandastimestampunix-timestampvalueerror

Combining date and time (timestamp) in a list of dataframes


I wish to combine date and time into one timestamp on a list of dataframes and to designate a week to the call date.

Here is the error : ValueError: could not convert string to Timestamp

I have used the following function :

def new_call_time(df):
    i=0
    df[' CALL_DATE_MANIPULATED']=str(df['CALL_DATE'][i]).split()[0] + ' ' + str(df['CALL_TIME'][i]) 
    df[' UNIX_TIME']= pd.Timestamp(df[' CALL_DATE_MANIPULATED'][i]).value//10 ** 9 
    df[' WEEK']=''

    for i in range(len(df)):
        df[' CALL_DATE_MANIPULATED'][i]=str(df['CALL_DATE'][i]).split()[0] + ' ' + str(df['CALL_TIME'][i])
        df[' UNIX_TIME'][i]= pd.Timestamp(df[' CALL_DATE_MANIPULATED'][i]).value// 10 ** 9
        df[' WEEK'][i]=df[' UNIX_TIME'][i]//604800

return df

Here is the function call statement :

for df in data_frame : 
    df = new_call_time(df)

Here are the tables I have read from excel sheets (contained in a list called data_frame) :

    CALL_DATE    CALL_TIME
    01-JAN-2016  00:15:06   
    01-JAN-2016  07:07:00

    CALL_DATE    CALL_TIME
    01-JAN-2016  08:40:38   
    01-JAN-2016  08:44:14   

    CALL_DATE    CALL_TIME
    01-JAN-2016  08:51:10   
    01-JAN-2016  09:06:31

This is working for an individual dataframe but isn't working for a list of dataframes.

The new tables should have the following columns too : example : data_frame[0] -

     CALL_DATE       CALL_TIME  CALL_DATE_MANIPULATED UNIX_TIME  WEEK
     01-JAN-2016     00:15:06   01-JAN-2016 00:15:06  1451607306 2400   
     01-JAN-2016     07:07:00   01-JAN-2016 07:07:00  1451632020 2400

Thanks a lot :)))


Solution

  • Use Series.str.split with indexing str[0] and then to_datetime with parameters errors='coerce' for NaT if some values not matched format defined in format (parameter format is for better performance, but should be omitted):

    def new_call_time(df):
        df['CALL_DATE_MANIPULATED'] = (df['CALL_DATE'].astype(str).str.split().str[0] + ' ' + 
                                       df['CALL_TIME'].astype(str))
        dates = pd.to_datetime(df['CALL_DATE_MANIPULATED'], 
                               errors='coerce', 
                               format='%d-%b-%Y %H:%M:%S')
        df['UNIX_TIME'] = dates.values.astype(np.int64) // 10 ** 9
        df['WEEK'] = df['UNIX_TIME'] //604800
        return df
    

    Call function in list comprehension for new list of DataFrames:

    data_frame1 = [new_call_time(df) for df in data_frame]