string, pandas, text, replace, strip

In a pandas DataFrame column, remove the last 4 digits if they are 2017


In a pandas DataFrame, there is a column X with values like 12342017, 23456782017, WC456123, ER2017124. I want to remove the last four digits if they are '2017'.

So my desired output is 1234, 2345678, WC456123, ER2017124.


Solution

  • Use Series.str.replace with a regex in which $ anchors the match to the end of the string. If the column may mix numbers with strings, convert it to strings first:

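    import pandas as pd

    df = pd.DataFrame({'X': ['12342017', '23456782017', 'WC456123', 'ER2017124']})

    # regex=True is required on pandas 2.0+, where str.replace defaults to literal matching
    df['X'] = df['X'].astype(str).str.replace(r'2017$', '', regex=True)
    print(df)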
               X
    0       1234
    1    2345678
    2   WC456123
    3  ER2017124
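
If you would rather avoid a regex, one possible alternative is to build a boolean mask with Series.str.endswith and slice off the last four characters only where it is True. This is a minimal sketch, not part of the original answer; the integer in the sample data is an assumption added to show the mixed-type case, and the variable names s and ends_2017 are just illustrative.

```python
import pandas as pd

# Mixed ints and strings (the int is an assumed example of a mixed-type column).
df = pd.DataFrame({'X': [12342017, '23456782017', 'WC456123', 'ER2017124']})

s = df['X'].astype(str)             # make every value a string first
ends_2017 = s.str.endswith('2017')  # True where the value ends in '2017'

# where() keeps values whose condition is True; elsewhere it takes the
# sliced version with the last four characters removed.
df['X'] = s.where(~ends_2017, s.str[:-4])

print(df['X'].tolist())
```

Because the match is a plain suffix test, a value such as ER2017124, which contains 2017 but does not end with it, is left unchanged, just as with the $ anchor.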