I have a dataframe with one column that contains lists of lists. Each basic list contains two strings, and the outer lists hold a variable number of basic lists. Example:
A
0 [[1,1],[1,1]]
1 [[1,1]]
2 [[1,1],[1,1],[1,1]]
I want a new dataframe with two columns: the first holds the first item of each basic list, and the second holds the second item. I solved it this way:
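import pandas as pd

df = pd.DataFrame(data = {'A': [[[1,1],[1,1]], [[1,1]], [[1,1],[1,1],[1,1]]]})
df2 = pd.DataFrame(columns = ['A', 'B'])
for x in df.A:
    for i in x:
        # build a one-row frame from each basic list and append it
        n = pd.DataFrame([i], columns = ('A', 'B'))
        # note: DataFrame.append was removed in pandas 2.0
        df2 = df2.append(n)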
A B
0 1 1
0 1 1
0 1 1
0 1 1
0 1 1
0 1 1
I know it is not good to loop through a dataframe, but I couldn't figure out how to avoid it. Here are some failed attempts:
for x in df1:
    df2 = [df2.append(pd.DataFrame([i], columns = ('A', 'B'))) for i in x]
df2 = df1.apply(lambda x: df2.append(pd.DataFrame([x[0]], columns = ['name', 'tid'])))
If I had got the first list comprehension to work, I would have tried to move the for loop into the end of that comprehension.
Thank you in advance for your help!
Does this do the trick?
import pandas as pd
import itertools

df = pd.DataFrame(data = {'A': [[[1,1],[1,1]], [[1,1]], [[1,1],[1,1],[1,1]]]})
a = []  # first items of the basic lists, grouped per row
b = []  # second items of the basic lists, grouped per row
for k in range(len(df)):
    a.append([x[0] for x in df.iloc[k].A])
    b.append([x[1] for x in df.iloc[k].A])
# flatten the per-row groups and build the two columns
df2 = pd.DataFrame(data = {'A': list(itertools.chain(*a)), 'B': list(itertools.chain(*b))})
Result:
>>> df2
A B
0 1 1
1 1 1
2 1 1
3 1 1
4 1 1
5 1 1
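A shorter variant of the same idea, as a sketch rather than the definitive way: since every basic list already has exactly two items, the outer lists can be flattened in one pass and handed straight to the DataFrame constructor. The name `pairs` is just an illustrative helper variable, and the sample data is the one from the question.

```python
import itertools
import pandas as pd

df = pd.DataFrame({'A': [[[1, 1], [1, 1]], [[1, 1]], [[1, 1], [1, 1], [1, 1]]]})

# Chain the outer lists together so that every basic list
# [first, second] becomes one element of a single flat list.
pairs = list(itertools.chain.from_iterable(df['A']))

# Each basic list becomes one row; the two items map to columns A and B.
df2 = pd.DataFrame(pairs, columns=['A', 'B'])
print(df2)
```

This avoids both the explicit row loop and the two intermediate lists, and it gives a fresh 0..n-1 index. It assumes every basic list has exactly two items; a row with an empty outer list simply contributes no rows.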