Tags: python, datetime, pandas, dataframe, leading-zero

Better way to change pandas date format to remove leading zeros?


My DataFrame looks like this:

       OPENED
0  2004-07-28
1  2010-03-02
2  2005-10-26
3  2006-06-30
4  2012-09-21

I successfully converted the dates to my desired format (two-digit year, month, day, with the leading zero removed), shown below, but my approach seems very inefficient.

   OPENED
0   40728
1  100302
2   51026
3   60630
4  120921

The code that I used for the date conversion is:

df['OPENED'] = pd.to_datetime(df.OPENED, format='%Y-%m-%d')
df['OPENED'] = df['OPENED'].apply(lambda x: x.strftime('%y%m%d'))
df['OPENED'] = df['OPENED'].apply(lambda i: str(i))
df['OPENED'] = df['OPENED'].apply(lambda s: s.lstrip("0"))
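
To make the question reproducible, here is a minimal, self-contained sketch of the steps above. The DataFrame construction is an assumption based on the sample shown; the four conversion steps are the same as in the snippet.

```python
import pandas as pd

# Assumed setup: the sample rows from the question, stored as plain strings.
df = pd.DataFrame({'OPENED': ['2004-07-28', '2010-03-02', '2005-10-26',
                              '2006-06-30', '2012-09-21']})

# Step 1: parse the strings into datetime64 values.
df['OPENED'] = pd.to_datetime(df.OPENED, format='%Y-%m-%d')
# Step 2: format each Timestamp as yymmdd, e.g. '040728'.
df['OPENED'] = df['OPENED'].apply(lambda x: x.strftime('%y%m%d'))
# Step 3: convert to str (a no-op here, since strftime already returns str).
df['OPENED'] = df['OPENED'].apply(lambda i: str(i))
# Step 4: strip the leading zero left by the years 2000-2009.
df['OPENED'] = df['OPENED'].apply(lambda s: s.lstrip("0"))

print(df['OPENED'].tolist())
```

Each `apply` call runs a Python function once per row, which is the part that feels slow on a large column.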

Solution

  • If the column holds strings, you can use str.replace to drop the dashes, then slice off the first two characters (the century) with str[2:], and finally remove the leading zero with str.lstrip:

    print (type(df.loc[0,'OPENED']))
    <class 'str'>
    print (df.OPENED.dtype)
    object
    
    print (df.OPENED.str.replace('-','').str[2:].str.lstrip('0'))
    0     40728
    1    100302
    2     51026
    3     60630
    4    120921
    Name: OPENED, dtype: object
    

    If the dtype is already datetime, use dt.strftime followed by str.lstrip:

    print (type(df.loc[0,'OPENED']))
    <class 'pandas._libs.tslibs.timestamps.Timestamp'>
    print (df.OPENED.dtype)
    datetime64[ns]
    
    print (df.OPENED.dt.strftime('%y%m%d').str.lstrip('0'))
    0     40728
    1    100302
    2     51026
    3     60630
    4    120921
    Name: OPENED, dtype: object
    

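    Thanks to Jon Clements for suggesting this alternative in a comment; it formats the two-digit year as an integer, so the year is never zero-padded in the first place: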

    print (df['OPENED'].apply(lambda L: '{0}{1:%m%d}'.format(L.year % 100, L)))
    0     40728
    1    100302
    2     51026
    3     60630
    4    120921
    Name: OPENED, dtype: object
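
As a rough check, the three approaches above can be run side by side on the sample data. This is only a sketch: the Series construction and the variable names (`raw`, `dates`, `from_str`, `from_dt`, `from_fmt`, `y2k`) are assumptions for illustration, and `regex=False` is added so the dash is treated as a literal string.

```python
import pandas as pd

# Assumed setup: the sample dates from the question as strings and as datetimes.
raw = pd.Series(['2004-07-28', '2010-03-02', '2005-10-26',
                 '2006-06-30', '2012-09-21'], name='OPENED')
dates = pd.to_datetime(raw, format='%Y-%m-%d')

# 1) String column: drop the dashes, drop the century, strip leading zeros.
from_str = raw.str.replace('-', '', regex=False).str[2:].str.lstrip('0')

# 2) datetime64 column: format via the .dt accessor, then strip leading zeros.
from_dt = dates.dt.strftime('%y%m%d').str.lstrip('0')

# 3) Per-element formatting: the year is rendered as an int, so it is not padded.
from_fmt = dates.apply(lambda L: '{0}{1:%m%d}'.format(L.year % 100, L))

# All three agree on the sample data.
print(from_str.tolist() == from_dt.tolist() == from_fmt.tolist())  # True

# Edge case: for the year 2000 the approaches differ.
y2k = pd.Series(pd.to_datetime(['2000-07-28']))
print(y2k.dt.strftime('%y%m%d').str.lstrip('0').tolist())  # ['728']
print(y2k.apply(lambda L: '{0}{1:%m%d}'.format(L.year % 100, L)).tolist())  # ['00728']
```

The sample data contains no dates from the year 2000, so which of the two results is wanted there is not specified in the question. It is worth checking if such dates can occur, because lstrip removes every leading zero, including the one belonging to the month.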