Search code examples
pythondatetimepandasdate-conversionweekday

Python/Pandas: Speedup calculation of weekday of datetime-data


I'm loading data in pandas, whereas the column date contains the datetime values, e.g.:

   date    ; .....more stuff ......
2000-01-03 ; 
2000-01-04 ;
2000-01-06 ;
...
2000-01-31 ;
2000-02-01 ;
2000-02-02 ;
2000-02-04 ;

I have a function to add a column containing the weekday-indices (0-6):

def genWeekdays(df,src='date',target='weekday'):
    """
    bla bla bla
    """
    df[target] = df[src].apply(lambda x: x.weekday())
    return df

calling it via

df = genWeekdays(df)

df has about a million rows and this takes about 1.3secs. Any way to speed this up? Im little surprised on how long this takes on my i7-4770k :( Thanks in advance


Solution

  • In [30]: df = DataFrame(dict(date = pd.date_range('20000101',periods=100000,freq='s'), value = np.random.randn(100000)))
    
    In [31]: df['weekday'] = pd.DatetimeIndex(df['date']).weekday
    
    In [32]: %timeit pd.DatetimeIndex(df['date']).weekday
    10 loops, best of 3: 34.9 ms per loop
    
    In [33]: df
    Out[33]: 
                         date     value
    
    In [33]: df
    Out[33]: 
                         date     value  weekday
    0     2000-01-01 00:00:00 -0.046604        5
    1     2000-01-01 00:00:01 -1.691611        5
    2     2000-01-01 00:00:02  0.416015        5
    3     2000-01-01 00:00:03  0.054822        5
    4     2000-01-01 00:00:04 -0.661163        5
    5     2000-01-01 00:00:05  0.274402        5
    6     2000-01-01 00:00:06 -0.426533        5
    7     2000-01-01 00:00:07  0.028769        5
    8     2000-01-01 00:00:08  0.248581        5
    9     2000-01-01 00:00:09  1.302145        5
    10    2000-01-01 00:00:10 -1.886830        5
    11    2000-01-01 00:00:11  2.276506        5
    12    2000-01-01 00:00:12  0.054104        5
    13    2000-01-01 00:00:13  0.378990        5
    14    2000-01-01 00:00:14  0.868879        5
    15    2000-01-01 00:00:15 -0.046810        5
    16    2000-01-01 00:00:16 -0.499447        5
    17    2000-01-01 00:00:17  1.067412        5
    18    2000-01-01 00:00:18 -1.625986        5
    19    2000-01-01 00:00:19  0.515884        5
    20    2000-01-01 00:00:20 -1.884882        5
    21    2000-01-01 00:00:21  0.943775        5
    22    2000-01-01 00:00:22  0.034501        5
    23    2000-01-01 00:00:23  0.438170        5
    24    2000-01-01 00:00:24 -1.211937        5
    25    2000-01-01 00:00:25 -0.229930        5
    26    2000-01-01 00:00:26  0.938805        5
    27    2000-01-01 00:00:27  0.026815        5
    28    2000-01-01 00:00:28  2.166740        5
    29    2000-01-01 00:00:29 -0.096927        5
    ...                   ...       ...      ...
    99970 2000-01-02 03:46:10 -0.310023        6
    99971 2000-01-02 03:46:11  0.561321        6
    99972 2000-01-02 03:46:12  2.207426        6
    99973 2000-01-02 03:46:13 -0.253933        6
    99974 2000-01-02 03:46:14 -0.711145        6
    99975 2000-01-02 03:46:15 -0.477377        6
    99976 2000-01-02 03:46:16  1.492970        6
    99977 2000-01-02 03:46:17  0.308510        6
    99978 2000-01-02 03:46:18  0.126579        6
    99979 2000-01-02 03:46:19 -1.704093        6
    99980 2000-01-02 03:46:20 -0.328285        6
    99981 2000-01-02 03:46:21  1.685411        6
    99982 2000-01-02 03:46:22 -0.368899        6
    99983 2000-01-02 03:46:23  0.915786        6
    99984 2000-01-02 03:46:24 -1.694855        6
    99985 2000-01-02 03:46:25 -1.488130        6
    99986 2000-01-02 03:46:26 -1.274004        6
    99987 2000-01-02 03:46:27 -1.508376        6
    99988 2000-01-02 03:46:28  0.551695        6
    99989 2000-01-02 03:46:29  0.007957        6
    99990 2000-01-02 03:46:30 -0.214852        6
    99991 2000-01-02 03:46:31 -1.390088        6
    99992 2000-01-02 03:46:32 -0.472137        6
    99993 2000-01-02 03:46:33 -0.969515        6
    99994 2000-01-02 03:46:34  1.129802        6
    99995 2000-01-02 03:46:35 -0.291428        6
    99996 2000-01-02 03:46:36  0.337134        6
    99997 2000-01-02 03:46:37  0.989259        6
    99998 2000-01-02 03:46:38  0.705592        6
    99999 2000-01-02 03:46:39 -0.311884        6
    
    [100000 rows x 3 columns]