Search code examples
pythonpython-3.xpandasdataframecsv

Update one column of one csv based on column from another in Python


I have two csv s like below :-

Master.csv

Column1 Column2 Column3 Column4
First 1001 Test1 1003
Second 1002 Test2 1004

Reference.csv

Code Value
1001 Val1
1002 Val2
1003 Val3
1004 Val4

As can be seen the Column2 and Column4 of Master.csv values are present in the same column Code of Reference.csv and I want to join the two csv to populate like below :-

Column1 Column2 Column3 Column4
First Val1 Test1 Val3
Second Val2 Test2 Val4

Please let me know it can be achieved through Python ..

I tried to join to csvs but unable to get the expected outputs.


Solution

  • Create DataFrames first by read_csv and use DataFrame.replace:

    df1 = pd.read_csv('Master.csv')
    df2 = pd.read_csv('Reference.csv', index_col=['Code'])
    
    out = df1.replace(df2['Value'])
    print (out)
      Column1 Column2 Column3 Column4
    0   First    Val1   Test1    Val3
    1  Second    Val2   Test2    Val4
    

    If need write to csv:

    out.to_csv('output.csv', index=False)
    

    If need replace only columns from list:

    df1 = pd.read_csv('Master.csv')
    df2 = pd.read_csv('Reference.csv', index_col=['Code'])
    
    cols = ['Column2', 'Column4']
    df1[cols] = df1[cols].replace(df2['Value'])
    
    df1.to_csv('output.csv', index=False)