TypeError: 'expand' is an invalid keyword argument for split()

I have a DataFrame with start and end time columns. Because these columns might contain gibberish values, I have wrapped the conversion in try/except blocks. I'm trying to zero-pad the times to make them consistent and then save them as datetime.time values. Here's the code:

    for i in range(df.shape[0]):
        try:
            df.loc[i,'start time'] = pd.to_datetime(df.loc[i,'start time'].split(':', expand=True)
                                                     .apply(lambda col: col.str.zfill(2))
                                                     .fillna('00')
                                                     .agg(':'.join, axis=1)).dt.time
        except:
            pass
        try:
            df.loc[i,'end time'] = pd.to_datetime(df.loc[i,'end time'].str.split(':', expand=True)
                                                     .apply(lambda col: col.str.zfill(2))
                                                     .fillna('00')
                                                     .agg(':'.join, axis=1)).dt.time
        except:
            pass

But this piece of code gives an error: TypeError: 'expand' is an invalid keyword argument for split()

What am I missing here?

Solution

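You are confusing pd.Series.str.split with the built-in str.split. Because you iterate through the elements one by one, df.loc[i, 'start time'] is a plain Python string, not a Series, so you are calling str.split, which has no expand parameter. Only the Series.str accessor version accepts expand=True.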

>>> '12:32:28'.split(':')
['12', '32', '28']

>>> '12:32:28'.split(':', expand=True)
...
TypeError: 'expand' is an invalid keyword argument for split()


>>> df['start_time'].str.split(':')
0      [2, 3, 4]
1     [2, 5, 55]
2     [2, 8, 46]
3    [2, 11, 37]
4    [2, 14, 28]
Name: start_time, dtype: object

>>> df['start_time'].str.split(':', expand=True)
   0   1   2
0  2   3   4
1  2   5  55
2  2   8  46
3  2  11  37
4  2  14  28
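
If you would rather keep the element-wise loop, the padding can be done with plain string methods, since each element is an ordinary str. A minimal sketch, where pad_time and its n_parts parameter are hypothetical names introduced for illustration and not part of pandas:

```python
def pad_time(value, n_parts=3):
    """Zero-pad each ':'-separated part of a single time string."""
    parts = str(value).split(':')              # built-in str.split returns a list
    parts += ['00'] * (n_parts - len(parts))   # fill in missing components, if any
    return ':'.join(p.zfill(2) for p in parts)

print(pad_time('2:3:4'))
print(pad_time('2:14'))
```

This only normalizes the text. Gibberish input is passed through unchanged and would still have to be rejected by the later datetime conversion.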

I think your code could simply be (without any loop):

>>> pd.to_datetime(df['start_time'], format='%H:%M:%S').dt.time
0    02:03:04
1    02:05:55
2    02:08:46
3    02:11:37
4    02:14:28
Name: start_time, dtype: object

Input dataframe:

>>> df
  start_time
0      2:3:4
1     2:5:55
2     2:8:46
3    2:11:37
4    2:14:28
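
The question also mentions gibberish values and missing components, which a fixed format alone would reject. Here is one possible vectorized sketch. It assumes made-up sample data, that at least one row has all three components, and that unparseable entries should become NaT instead of raising (errors='coerce'):

```python
import pandas as pd

# Hypothetical sample data: unpadded, incomplete, malformed, and missing values.
df = pd.DataFrame({'start_time': ['2:3:4', '2:5', 'gibberish', None]})

# Series.str.split (unlike the built-in str.split) accepts expand=True.
parts = df['start_time'].str.split(':', expand=True)

# Zero-pad every component, fill absent components with '00', and rejoin.
padded = (parts.apply(lambda col: col.str.zfill(2))
               .fillna('00')
               .agg(':'.join, axis=1))

# Keep originally missing rows missing instead of turning them into '00:00:00'.
padded = padded.where(df['start_time'].notna())

# errors='coerce' maps anything that does not match the format to NaT.
times = pd.to_datetime(padded, format='%H:%M:%S', errors='coerce').dt.time
print(times)
```

Rows that were missing or could not be parsed come back as missing values, so no try/except is needed around the conversion.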