I have the following two frames:
frame1:
id
0 111-111-111
1 111-111-222
2 222-222-222
3 333-333-333
frame2:
data id
0 ones 111-111
1 threes 333-333
And, I have a lambda function that maps the frame1.id
to frame2.id
:
id_map = lambda x: x[:7]
My goal is to perform an inner join between these two tables, but to have the id go through the lambda. So that the output is:
id data
0 111-111-111 ones
1 111-111-222 ones
2 333-333-333 threes
I've come up with a rather non-elegant solution that almost does what I'm trying to do, however it messes up when the inner join removes rows:
# Save a copy the original ids of frame1
frame1_ids = frame1['id'].copy()
# Apply the id change to frame1
frame1['id'] = frame1['id'].apply(id_map)
# Merge
frame1 = frame1.merge(frame2, how='inner', on='id')
# Set the ids back to what they originally were
frame1['id'] = frame1_ids
Is there a elegant solution for this?
Could use assign
to create a dummy id column (newid) to join on like this:
frame1.assign(newid=frame1['id'].str[:7])
.merge(frame2, left_on='newid', right_on='id', suffixes=('','_y'))
.drop(['id_y','newid'], axis=1)
Output:
id data
0 111-111-111 ones
1 111-111-222 ones
2 333-333-333 threes