I have 2 csv files
one.csv
Supplier Code | Cost Price |
---|---|
1 | 2.25 |
2 | 1.25 |
3 | 3.25 |
two.csv
Supplier Code | Cost Price |
---|---|
1 | 2.25 |
2 | 1.75 |
3 | 3.75 |
5 | 1.25 |
So one.csv
is compared to two.csv
ONE.csv has about 500 values in 2 columns. TWO.csv has 10000 values in 2 columns. The shape of the files is same. Code column does not change, but is either present or absent in ONE.csv/TWO.csv
What I am looking to do is:
Find subset of the all Supplier Code in ONE.csv that are present in TWO.csv, with Cost Price updated from TWO.csv
Output that to CSV file
Expected output:
Supplier Code,Cost Price
1,2.25
2,1.75
3,3.75
This is what I have tried...
one = pd.read_csv(ONE_PATH)
two = pd.read_csv(TWO_PATH)
present_in_two = one[one['Supplier Code'].isin(two['Supplier Code'])]
print(present_in_two.head())
I get:
Empty DataFrame
Columns: [Supplier Code, Cost Price]
Index: []
Edit: Corrected typo..
This works for your use case.
Merge the two dataframes on key Supplier Code
and select only the key column from df_one
when merging.
The data form df_two
would be selected which also exist on df_one
dict_one = {"Supplier Code": [1, 2, 3], "Cost Price": [2.25, 1.25, 3.25]}
dict_two = {"Supplier Code": [1, 2, 3, 5], "Cost Price": [2.25, 1.75, 3.75, 1.25]}
df_one = pd.DataFrame(dict_one)
df_two = pd.DataFrame(dict_two)
df_merged = pd.merge(df_one[["Supplier Code"]], df_two, on="Supplier Code")
df_merged
Supplier Code Cost Price
0 1 2.25
1 2 1.75
2 3 3.75