I am having a dataset which look like follows(in dataframe):
**_id** **paper_title** **references** **full_text**
1 XYZ [{'abc':'something','def':'something'},{'def':'something'},...many others] something
2 XYZ [{'abc':'something','def':'something'},{'def':'something'},...many others] something
3 XYZ [{'abc':'something'},{'def':'something'},...many others] something
Expected:
**_id** **paper_title** **abc** **def** **full_text**
1 XYZ something something something
something something
.
.
(all the dic in list with respect to_id column)
2 XYZ something something something
something something
.
.
(all the dic in list with respect to_id column)
I have tried df['column_name'].apply(pd.Series).apply(pd.Series)
to split the list and dictionaries into columns of dataframe but doesn't help as it didn't split dictionaries.
First row of my dataframe: df.head(1)
After a lot of Documentation reading of pandas, I found the explode method applying with apply(pd.Series) is the easiest of what I was looking for in the question.
Here is the Code:
df = df.explode('reference')
# It explodes the lists to rows of the subset columns
df = df['reference'].apply(pd.Series).merge(df, left_index=True, right_index=True, how ='outer')
# split a list inside a Dataframe cell into rows and merge with original dataframe like (AUB) in set theory
Sidenote: while merging look for unique values in column as there will many columns with duplicated values
I hope this helps someone with dataframe/Series with columns having list containing multiple dictionaries and want to split list of multiple dictionaries key to new column with values as their rows.