python, pandas, pandas-apply

Unpacking many columns of lists using apply raises ValueError: If using all scalar values, you must pass an index


I want to unpack multiple columns of lists into many more columns. This is the same operation as expanding a single column of lists into separate columns, but for several list columns at once and without for loops.

As an example I have a pandas.DataFrame,

import pandas as pd

tst = pd.DataFrame({'A': [[1, 2]] * 5, 'B': [[3, 4]] * 5, 'C': [[5, 6]] * 5})

I can easily unpack one of the columns, e.g. A, into multiple columns:

pd.DataFrame(tst['A'].to_list(), 
             columns=['1' + tst['A'].name, '2' + tst['A'].name],
             index=list(range(tst['A'].shape[0]))
            )

However when I tried expanding this to multiple columns using .apply to avoid a for loop,

tst.apply(
    lambda x: pd.DataFrame(x.to_list(), 
                           columns=['1' + x.name, '2' + x.name], 
                           index=list(range(x.shape[0]))
                          )
)

I get the error below, even though I am supplying an index:

ValueError: If using all scalar values, you must pass an index

Is there a way to fix this so that I get an output as per below? (column order doesn't matter)

    1C  2C  1B  2B  1A  2A
0   5   6   3   4   1   2
1   5   6   3   4   1   2
2   5   6   3   4   1   2
3   5   6   3   4   1   2
4   5   6   3   4   1   2

pd.__version__ == '1.0.5'


Solution

  • You can horizontally stack the columns, then create a new dataframe and rename the columns:

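    import numpy as np
    # Each column's lists form an (n_rows, 2) block; hstack joins the blocks side by side.
    df = pd.DataFrame(np.hstack(tst.values.T.tolist()))
    df.columns = [f'{i}{c}' for c in tst for i in range(1, 3)]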
    

    Alternatively, you can use pd.concat along axis=1:

    df = pd.concat([pd.DataFrame(tst[c].tolist()) for c in tst], axis=1)
    df.columns = [f'{i}{c}' for c in tst for i in range(1, 3)]
    

    print(df)
    
       1A  2A  1B  2B  1C  2C
    0   1   2   3   4   5   6
    1   1   2   3   4   5   6
    2   1   2   3   4   5   6
    3   1   2   3   4   5   6
    4   1   2   3   4   5   6
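
Both snippets above hard-code `range(1, 3)`, so they only fit lists of length two. As a rough sketch, the concat approach could be wrapped in a small helper that derives the column names from the width of each expanded column. The name `unpack_list_columns` is hypothetical (it is not a pandas function), and the sketch assumes every list within a given column has the same length. It still iterates over the columns, as the answer's list comprehension does, but never over the rows.

```python
import pandas as pd


def unpack_list_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Expand every list-valued column of `frame` into one column per element.

    Hypothetical helper, not part of pandas. Assumes all lists within a
    column share the same length.
    """
    parts = []
    for col in frame.columns:
        # One row per original row, one column per list element.
        expanded = pd.DataFrame(frame[col].tolist(), index=frame.index)
        # Name the new columns '1A', '2A', ... to match the question.
        expanded.columns = [f"{i + 1}{col}" for i in range(expanded.shape[1])]
        parts.append(expanded)
    # Glue the per-column frames together side by side.
    return pd.concat(parts, axis=1)


tst = pd.DataFrame({'A': [[1, 2]] * 5, 'B': [[3, 4]] * 5, 'C': [[5, 6]] * 5})
out = unpack_list_columns(tst)
print(out)
```

Passing `index=frame.index` keeps the original row labels, which matters if the input frame does not use the default 0..n-1 index.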