I have a dataset with some of the column having identical column name. I want to merge column with same column name such that the values are appended as rows. And, for the columns that don't have column with same column name, 0 is appended in rows.
I have tried melting, but it doesn't seems to work for the format i need.
Sample Data:
print (df)
Date Column_A Column_A Column_B
0 1/2/2018 3 2 3
1 2/2/2018 4 7 1
2 3/2/2018 2 2 6
3 4/2/2018 1 1 4
Expected output:
Date Column_A Column_B
0 1/2/2018 3 3.0
1 2/2/2018 4 1.0
2 3/2/2018 2 6.0
3 4/2/2018 1 4.0
4 1/2/2018 2 0.0
5 2/2/2018 7 0.0
6 3/2/2018 2 0.0
7 4/2/2018 1 0.0
Idea is create MultiIndex
in columns with GroupBy.cumcount
, then reshape by DataFrame.stack
, sorting by second level of MultiIndex by DataFrame.sort_index
and last remove second level with convert first level to column Date
by double DataFrame.reset_index
:
df = df.set_index('Date')
s = df.columns.to_series()
df.columns = [df.columns, s.groupby(s).cumcount()]
df = df.stack().sort_index(level=1).fillna(0).reset_index(level=1, drop=True).reset_index()
print (df)
Date Column_A Column_B
0 1/2/2018 3 3.0
1 2/2/2018 4 1.0
2 3/2/2018 2 6.0
3 4/2/2018 1 4.0
4 1/2/2018 2 0.0
5 2/2/2018 7 0.0
6 3/2/2018 2 0.0
7 4/2/2018 1 0.0