I have two df I wish to merge. Examples:
dfone
Centre Name open click
0 12345 Ed 4 7
1 67890 Dave 8 3
2 54321 Mary 5 6
3 09876 Susan 4 3
4 24680 Tom 9 7
5 13579 Will 6 2
dftwo
Centre Name Train Online Home
0 67890 Dave Y Y N
1 13579 Will N Y Y
2 09876 Susan Y Y Y
3 24680 Tom N N Y
4 54321 Mary Y Y N
5 12345 Ed N N N
I use the following to merge the two dfs:
dfthree = pd.merge(dfone, dftwo, left_on='Centre', right_on='Centre', how='left',
suffixes=('','_y')).filter(regex='^(?!.*_y)')
My desired output is:
Centre Name open click Train Online Home
0 12345 Ed 4 7 N N N
1 67890 Dave 8 3 Y Y N
2 54321 Mary 5 6 Y Y N
3 09876 Susan 4 3 Y Y Y
4 24680 Tom 9 7 N N Y
5 13579 Will 6 2 N Y Y
My ACTUAL output is:
Centre Name open click Train Online Home
0 12345 Ed 4 7 N N N
1 67890 Dave 8 3 N N N
2 54321 Mary 5 6 N N N
3 09876 Susan 4 3 Nan Nan Nan
4 24680 Tom 9 7 Nan Nan Nan
5 13579 Will 6 2 Nan Nan Nan
Where am I going wrong please and what should I be doing instead?
Example
import pandas as pd
data1 = {'Centre': {0: 12345, 1: 67890, 2: 54321, 3: 9876, 4: 24680, 5: 13579},
'Name': {0: 'Ed', 1: 'Dave', 2: 'Mary', 3: 'Susan', 4: 'Tom', 5: 'Will'},
'open': {0: 4, 1: 8, 2: 5, 3: 4, 4: 9, 5: 6},
'click': {0: 7, 1: 3, 2: 6, 3: 3, 4: 7, 5: 2}}
data2 = {'Centre': {0: 67890, 1: 13579, 2: 9876, 3: 24680, 4: 54321, 5: 12345},
'Name': {0: 'Dave', 1: 'Will', 2: 'Susan', 3: 'Tom', 4: 'Mary', 5: 'Ed'},
'Train': {0: 'Y', 1: 'N', 2: 'Y', 3: 'N', 4: 'Y', 5: 'N'},
'Online': {0: 'Y', 1: 'Y', 2: 'Y', 3: 'N', 4: 'Y', 5: 'N'},
'Home': {0: 'N', 1: 'Y', 2: 'Y', 3: 'Y', 4: 'N', 5: 'N'}}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
Code
df3 = df1.merge(df2, how='left')
df3
Centre Name open click Train Online Home
0 12345 Ed 4 7 N N N
1 67890 Dave 8 3 Y Y N
2 54321 Mary 5 6 Y Y N
3 9876 Susan 4 3 Y Y Y
4 24680 Tom 9 7 N N Y
5 13579 Will 6 2 N Y Y
Applying the code to an well-made example before you apply it to your dataset will reveal the cause of most problems. Your problem is not a problem with code.
if the merge result is NaN instead of the desired result, the actual data is the same when you see it with your eyes, but in reality it is often different.
ex) white space, case, data type, etc.
'abc' ≠ ' abc'
'abc' ≠ 'Abc'
1 ≠ '1'
merge
need exact same key, unless return NaN. The ex above is a typical example of what we think is the same key but different.