
Scalable approach to turn the values in a list into column values in a pandas dataframe in Python


I have a pandas dataframe with only one column. The value of each cell in that column is a list/array of numbers; every list has length 100, and this length is consistent across all cells.

We need to turn each list item into its own column value. In other words, we want a dataframe with 100 columns, where each column holds one item of the list/array.

Something like this: [image: original single-column dataframe]

becomes: [image: resulting multi-column dataframe]

It can be done with iterrows() as shown below, but we have around 1.5 million rows and need a scalable solution, as iterrows() would take a lot of time.

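# Slow approach: appends one row at a time, so it does not scale
cols = [f'col_{i}' for i in range(4)]  # 4 columns for illustration; 100 in the real data
df_inter = pd.DataFrame(columns=cols)
for index, row in df.iterrows():
    df_inter.loc[len(df_inter)] = row['message']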

Solution

  • You can build the new dataframe directly from the list of lists, without looping over the rows:

    In [28]: df = pd.DataFrame({'message':[[1,2,3,4,5], [3,4,5,6,7]]})
    
    In [29]: df
    Out[29]: 
               message
    0  [1, 2, 3, 4, 5]
    1  [3, 4, 5, 6, 7]
    
    In [30]: res = pd.DataFrame(df.message.tolist(), index= df.index)
    
    In [31]: res
    Out[31]: 
       0  1  2  3  4
    0  1  2  3  4  5
    1  3  4  5  6  7
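
If you also want the `col_0`, `col_1`, ... names from the question, something along these lines might work. This is only a minimal sketch: the sample data is made up, `n_items`, `arr` and `res_np` are names chosen here for illustration, and it assumes every list has the same length (as stated in the question). The NumPy variant is an alternative that may or may not be faster on your data, so it is worth timing both.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real frame has ~1.5 million rows of length-100 lists
df = pd.DataFrame({'message': [[1, 2, 3, 4, 5], [3, 4, 5, 6, 7]]})

# Number of items per list, taken from the first row
# (assumption: every row has a list of the same length)
n_items = len(df['message'].iloc[0])
cols = [f'col_{i}' for i in range(n_items)]

# Build the wide frame in a single constructor call instead of row by row
res = pd.DataFrame(df['message'].tolist(), index=df.index, columns=cols)

# Alternative: stack the lists into a 2-D NumPy array first.
# This relies on the equal-length assumption; ragged lists would not form a 2-D array.
arr = np.array(df['message'].tolist())
res_np = pd.DataFrame(arr, index=df.index, columns=cols)

print(res)
```

Passing `index=df.index` keeps the original row labels, so the result can be joined back to other columns of `df` if needed.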