I am new to data analysis and Python. I came across the following
df1.merge(df2, on='cnpj',suffixes=('','_y')).drop('index_y',axis=1)
for merging dataframes. I would like to know how axis and index_y are determined.
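The overlapping columns between df1 and df2 (other than the merge key 'cnpj') will be renamed according to the specified suffixes: nothing ("") is appended for df1 and "_y" for df2.
Afterwards you drop the column "index_y". That name is not chosen by you directly: it is df2's column called "index" after it received the "_y" suffix, so both dataframes must contain a column named "index" (for example, one left over from reset_index()). Setting axis=1 tells drop to look for that label among the columns; axis=0 would look among the row labels instead. The drop is performed after the merge, most likely to get rid of redundant information.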
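
To see where "index_y" comes from, here is a minimal sketch. The data is made up for illustration; the only assumption carried over from your snippet is that both dataframes have a column named "index" in addition to the key 'cnpj'. The columns "name" and "city" are hypothetical.

```python
import pandas as pd

# Hypothetical data: both frames carry a column literally named "index"
df1 = pd.DataFrame({"index": [0, 1], "cnpj": ["A", "B"], "name": ["x", "y"]})
df2 = pd.DataFrame({"index": [0, 1], "cnpj": ["A", "B"], "city": ["p", "q"]})

# "index" overlaps, so df1's copy keeps its name ("" suffix)
# and df2's copy becomes "index_y"
merged = df1.merge(df2, on="cnpj", suffixes=("", "_y"))
print(list(merged.columns))

# axis=1 means "index_y" is looked up among the column labels
result = merged.drop("index_y", axis=1)
print(list(result.columns))
```

Printing the column lists before and after the drop shows that "index_y" only exists because of the name clash. Writing merged.drop(columns="index_y") is an equivalent spelling that avoids the axis argument altogether.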