Create a MultiIndex DataFrame from a column containing lists

I have a pandas DataFrame that looks like this; column E contains lists:

A	B	C	D	E
1	1	1	1	[obj1, obj2,obj3]
1	1	2	1	empty list
1	1	2	1	[obj1]
1	2	2	1	[obj1, obj2]
2	1	1	1	empty list
2	1	2	1	[obj1, obj2,obj3]

I want to turn it into a MultiIndex (or any other kind of DataFrame) that looks like this, with one row for each value in the list but without duplicating the values in A-D:

A	B	C	D	E
1	1	1	1	obj1
				obj2
				obj3
1	1	2	1
1	1	2	1	obj1
1	2	2	1	obj1
				obj2
2	1	1	1
2	1	2	1	obj1
				obj2
				obj3

But I cannot figure out how to do it.

I tried using

df.explode(['E'], ignore_index=True)

This flattens the lists but duplicates the values in columns A-D, so when I count the values, I also count the duplicates.

Any ideas?

Solution

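You can explode the data and reset the index, then compare the resulting index column with a shifted copy of itself. Rows where the two match come from the same original row as the row above, so this mask lets you overwrite the values in a-d for exactly those rows. The following code does this:

import pandas as pd

# One row per list element; reset_index() keeps the original row label in an "index" column.
df_exploded = df.explode(column="e").reset_index()
# A row whose label equals the label of the row above is a repeat, so blank its a-d values.
df_exploded.loc[df_exploded["index"] == df_exploded["index"].shift(), df.columns[:-1]] = ""
df_exploded = df_exploded.drop(columns="index")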

It results in the following table:

    a  b  c  d    e
0   1  1  1  1   10
1                11
2                12
3   1  1  2  1  NaN
4   1  1  2  1    3
5   1  2  2  1   40
6                20
7   2  1  1  1  NaN
8   2  1  2  1   30
9                12
10               11
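
For a self-contained run, the input can be reconstructed from the table above. The values below are inferred from that output and are an assumption, not the original data. The helper name `blank_repeats` is made up for this sketch, and the cast to object is an extra precaution, since recent pandas versions warn when a string is written into an integer column.

```python
import pandas as pd

# Assumed input, inferred from the result table (not the original data).
df = pd.DataFrame(
    {
        "a": [1, 1, 1, 1, 2, 2],
        "b": [1, 1, 1, 2, 1, 1],
        "c": [1, 2, 2, 2, 1, 2],
        "d": [1, 1, 1, 1, 1, 1],
        "e": [[10, 11, 12], [], [3], [40, 20], [], [30, 12, 11]],
    }
)


def blank_repeats(frame, list_col="e", fill=""):
    """Explode list_col and blank the other columns on repeated rows.

    Hypothetical helper for illustration only.
    """
    other_cols = [c for c in frame.columns if c != list_col]
    # Start from a clean 0..n-1 index so every original row has a unique label,
    # then keep that label as an "index" column after exploding.
    out = frame.reset_index(drop=True).explode(list_col).reset_index()
    # True where a row comes from the same original row as the row above it.
    is_repeat = out["index"] == out["index"].shift()
    # Cast to object so the fill value can sit next to integers.
    out = out.astype({c: object for c in other_cols})
    out.loc[is_repeat, other_cols] = fill
    return out.drop(columns="index")


result = blank_repeats(df)
print(result)
```

Empty lists explode to a single row with NaN in e, which is why the two rows with empty lists still appear in the result.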

Note that if you want the columns to stay numeric, you can assign NaN values instead of empty strings. To do so, just adjust the following line:

df_exploded.loc[df_exploded["index"] == df_exploded["index"].shift(), df.columns[:-1]] = pd.NA # or np.nan if you want to use numpy
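
With NaN as the filler, `count()` skips the blanked cells, which covers the original goal of counting without duplicates. Here is a minimal sketch of that, assuming sample data inferred from the result table earlier in this answer (the data are an assumption, not the original input):

```python
import numpy as np
import pandas as pd

# Assumed input, inferred from the result table (not the original data).
df = pd.DataFrame(
    {
        "a": [1, 1, 1, 1, 2, 2],
        "b": [1, 1, 1, 2, 1, 1],
        "c": [1, 2, 2, 2, 1, 2],
        "d": [1, 1, 1, 1, 1, 1],
        "e": [[10, 11, 12], [], [3], [40, 20], [], [30, 12, 11]],
    }
)

df_exploded = df.explode(column="e").reset_index()
# True where a row comes from the same original row as the row above it.
is_repeat = df_exploded["index"] == df_exploded["index"].shift()
# NaN instead of "" keeps a-d numeric (integers are converted to floats).
df_exploded.loc[is_repeat, df.columns[:-1]] = np.nan
df_exploded = df_exploded.drop(columns="index")

# count() ignores NaN, so a-d are counted once per original row.
counts = df_exploded.count()
print(counts)
```

With this sample data, `counts["a"]` is 6 (one per original row) and `counts["e"]` is 9 (one per list element), while the exploded frame itself has 11 rows.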