I am trying to plot a histogram for missing NaN values across all features of a dataframe For that i created a dataframe for missing NaN values
Missing value Dataframe
0
-----
0 0
1 14
2 800
.
.
84 2344
Then i have this master dataframe that has multiple columns i am not concerned about since i want only the row names from this dataframe
Master Dataframe
0 1
---------
F1 3 3
F2 4 3
.
.
F85 5 2
How can i merge/ concatenate these 2 dataframes where the final output should be like (Columns in master dataframe are irrelevant since i want to plot number of missing values across all features i.e. F1,F2,...F85)
F1 0
F2 14
F3 800
.
.
F85 2344
Asume your dataframes are df1
(Missing value Dataframe) and df2
(Master dataframe). Then you can try this:
df1.columns=['X']
res = df2.reset_index().join(df1.reset_index(), rsuffix='_r')[['index', 'X']].set_index('index')
print(res)
Result will be:
index
F1 0
F2 14
F3 800
...
F85 2344
The idea is to use reset_index to replace index in both dataframes with row numbers and then merge dataframes