Search code examples
pythonpython-3.xpandasdataframexlsx

How to replace specific words in specific column with another from excel


  • I've problem with replacing, here's what I wrote, I need to replace 1999 with 1900 as you can see. I started recently, so please excuse me. (I searched a lot and watched clips on YouTube, but the method didn't work.)
        import pandas as pd
        df = pd.read_excel('book1.xlsx')

        #replace
        df.replace("1999","1900")

        #I also tried this method, but it didn't work.
        #df.replace(to_replace = "1999", value = "1900")

        #writer
        writer = pd.ExcelWriter('book2.xlsx')
        df.to_excel(writer,'new_sheet')
        writer.save()

  • My second question, how can I replace data through a text file (or Excel), for example, replace 1999 (in column A, book1.xlsx) with the column b in mistakes.xlsx.
 A     B
 1999   1900

Thank you guys for the help.


Solution

  • You could define a function and apply it element-wise with Series.apply:

    df = pandas.DataFrame.from_records([('Cryptonomicon', 1999), ('Snow Crash', 1992), ('Quicksilver', 2003)], columns=['Title', 'Year'])
    # df is:
    #             Title  Year
    #  0  Cryptonomicon  1999 
    #  1     Snow Crash  1992
    #  2    Quicksilver  2003
    
    
    
    # Imagine this dataframe came from an Excel spreadsheet...
    df_replacements = pandas.DataFrame.from_records([(1999, 1900), (2003, 3003)], columns=['A', 'B'])
    
    replacements = pandas.Series(df_replacements['B'].values, index=df_replacements['A'])
    
    def replaced(value):
        return replacements.get(value, value)
    
    df['Year'] = df['Year'].apply(replaced)
    
    # df is:
    #             Title  Year
    #  0  Cryptonomicon  1900
    #  1     Snow Crash  1992
    #  2    Quicksilver  3003
    

    If you have a very large dataframe, you could vectorize this using pandas.Series.map():

    year = df['Year']
    df['Year'] = year.where(~year.isin(replacements.keys()), 
                            year,
                            year.map(replacements))