Separating lists into a new dataframe


I have a dataframe with one column that contains lists of lists. Each basic (inner) list contains two strings, and each outer list holds a variable number of these basic lists. Example:

            A
    0 [[1,1],[1,1]]
    1 [[1,1]]
    2 [[1,1],[1,1],[1,1]]

I want a new dataframe that has two columns. The first has the first item in each basic list, the second column has the second item. I solved it this way:

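    import pandas as pd

    df = pd.DataFrame(data = {'A': [[[1,1],[1,1]], [[1,1]], [[1,1],[1,1],[1,1]]]})

    df2 = pd.DataFrame(columns = ['A', 'B'])
    for x in df.A:      # x is one outer list
        for i in x:     # i is one basic two-item list
            n = pd.DataFrame([i], columns = ('A', 'B'))
            # originally df2 = df2.append(n); DataFrame.append was removed in pandas 2.0
            df2 = pd.concat([df2, n])

This gives:

       A  B
    0  1  1
    0  1  1
    0  1  1
    0  1  1
    0  1  1
    0  1  1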

I know it is not good to loop through a dataframe, but I couldn't figure out how to avoid it. Here are some failed attempts:

    for x in df1:
        df2 = [df2.append(pd.DataFrame([i], columns = ('A', 'B'))) for i in x]

    df2 = df1.apply(lambda x: df2.append(pd.DataFrame([x[0]], columns = ['name', 'tid'])))

If I had got the first list comprehension to work, I would have tried to move the for loop to the end of that comprehension.

Thank you in advance for your help!


Solution

  • Does this do the trick?

    import pandas as pd
    import itertools

    df = pd.DataFrame(data = {'A': [[[1,1],[1,1]], [[1,1]], [[1,1],[1,1],[1,1]]]})

    # Collect the first and second items of every basic list, row by row.
    a = []
    b = []
    for k in range(len(df)):
        a.append([x[0] for x in df.iloc[k].A])
        b.append([x[1] for x in df.iloc[k].A])

    # Flatten the per-row lists and build the two-column dataframe.
    df2 = pd.DataFrame(data = {'A': list(itertools.chain(*a)), 'B': list(itertools.chain(*b))})

    Result:

    >>> df2
       A  B
    0  1  1
    1  1  1
    2  1  1
    3  1  1
    4  1  1
    5  1  1
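
The idea mentioned in the question, moving the for loop to the end of the list comprehension, can also be made to work. The following is a minimal sketch, not necessarily the only or best way: a single comprehension with two `for` clauses flattens the column one level, and the resulting list of pairs goes straight into the `DataFrame` constructor. The sample values are changed to distinct numbers (an assumption for illustration only) so the row order is visible, and the names `pairs`, `exploded` and `df3` are made up for this example.

```python
import pandas as pd

# Illustrative data: same shape as in the question, but with distinct values.
df = pd.DataFrame(data={'A': [[[1, 2], [3, 4]], [[5, 6]], [[7, 8], [9, 10], [11, 12]]]})

# Flatten one level: the outer loop walks the rows of column A,
# the inner loop walks the basic two-item lists inside each row.
pairs = [pair for row in df['A'] for pair in row]
df2 = pd.DataFrame(pairs, columns=['A', 'B'])

# Alternative sketch: Series.explode (pandas 0.25 or newer) puts each
# basic list on its own row, and tolist() hands them to the constructor.
exploded = df['A'].explode()
df3 = pd.DataFrame(exploded.tolist(), columns=['A', 'B'])

print(df2)
```

Both variants build the dataframe once instead of growing it row by row, which avoids creating a new dataframe on every iteration. Note that `explode` turns an empty outer list into a missing value, so the second variant assumes every row holds at least one basic list.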