I'd like the dataframe passed into this function to be modified.
import pandas as pd

def func(df):
    left_df = pd.DataFrame([[1, 2], [3, 4]], columns=['A', 'B'])
    right_df = pd.DataFrame([[5, 6], [7, 8]], columns=['C', 'D'])
    # This assignment rebinds the local name df only.
    df = pd.merge(left_df, right_df, how='outer', left_index=True, right_index=True)
    print("df is now a merged dataframe!")

test = pd.DataFrame()
func(test)
However, Python passes object references by value, so the callee func() gets its own local name df, which initially points to the same empty dataframe as test. Assigning the result of pd.merge() to df creates a new object and rebinds only the local name df to it. test is unchanged and continues pointing to the original empty dataframe.
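To make the distinction concrete, here is a minimal sketch contrasting assignment to the parameter with mutation of the object it refers to. The names rebind and mutate are hypothetical and not part of the code above.

```python
import pandas as pd

def rebind(df):
    # Assignment only rebinds the local name; the caller's object is untouched.
    df = pd.DataFrame({"A": [1, 3]})

def mutate(df):
    # Column assignment modifies the very object the caller passed in.
    df["A"] = [1, 3]

test = pd.DataFrame()

rebind(test)
empty_after_rebind = test.empty   # still the original empty dataframe

mutate(test)
empty_after_mutate = test.empty   # the caller's dataframe now has a column
```

Only the second call is visible to the caller, because it changes the shared object instead of pointing the local name somewhere else.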
How can we merge in place inside func() so that test is actually changed? I'd like something like pandas.DataFrame.update(), but that only supports left joins.
IIUC, something like this?
def func(df):
    # Declare the module-level name so the assignment below rebinds it.
    global test
    left_df = pd.DataFrame([[1, 2], [3, 4]], columns=['A', 'B'])
    right_df = pd.DataFrame([[5, 6], [7, 8]], columns=['C', 'D'])
    df = pd.merge(left_df, right_df, how='outer', left_index=True, right_index=True)
    print("df is now a merged dataframe!")
    test = df

test = pd.DataFrame()
func(test)
print(test)
Output:
df is now a merged dataframe!
   A  B  C  D
0  1  2  5  6
1  3  4  7  8
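If relying on a global is undesirable, two alternatives might look like the sketch below. The helper names build_merged, func_return and func_inplace are hypothetical, and the in-place variant assumes the dataframe passed in is empty (or already shares the merged index), so that column assignment lines up row for row.

```python
import pandas as pd

def build_merged():
    # Same merge as in the question, factored out for reuse.
    left_df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"])
    right_df = pd.DataFrame([[5, 6], [7, 8]], columns=["C", "D"])
    return pd.merge(left_df, right_df, how="outer",
                    left_index=True, right_index=True)

def func_return(df):
    # Option 1: return the new object and let the caller rebind its own name.
    return build_merged()

def func_inplace(df):
    # Option 2: copy the merged columns into the object that was passed in.
    merged = build_merged()
    for col in merged.columns:
        df[col] = merged[col]

test = pd.DataFrame()
test = func_return(test)   # caller reassigns explicitly

other = pd.DataFrame()
func_inplace(other)        # caller's object is modified directly
```

Returning the result is usually the simpler choice, since it works regardless of what the incoming dataframe contains. The in-place version only behaves like a full replacement under the assumption stated above.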