Handling Variable Number of Columns Dataframe - Python


I am trying to write a list of lists to an Excel sheet using pandas. The list looks like:

List_of_Lists = [ [1,2,3,4],
                  [5,6,7,8],
                  [9,10,11,12],
                  ........,
                ]

The number of these lists inside the main list could go up to 1000. I also want to label the columns column1, column2, and so on up to column100, for instance, all on the same sheet. Can anyone familiar with pandas help me? This is probably really easy for some.


Solution

  • I believe you can pass the list straight into pd.DataFrame(). Shorter lists are padded with NaN where values don't exist.

    For example:

    import pandas as pd

    List_of_Lists = [[1,2,3,4],
                     [5,6,7],
                     [9,10],
                     [11]]
    df = pd.DataFrame(List_of_Lists)
    print(df)
        0     1    2    3
    0   1   2.0  3.0  4.0
    1   5   6.0  7.0  NaN
    2   9  10.0  NaN  NaN
    3  11   NaN  NaN  NaN
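
    The padding is why columns 1 to 3 print as floats above: NaN is a float, so any padded column is upcast. If whole numbers matter in the output, one option is to cast to the nullable integer dtype. This is a minimal sketch, assuming a pandas version that supports the "Int64" extension dtype; the name df_int is only illustrative.

```python
import pandas as pd

List_of_Lists = [[1, 2, 3, 4],
                 [5, 6, 7],
                 [9, 10],
                 [11]]

# Ragged rows are padded with NaN, which turns the padded columns into floats.
df = pd.DataFrame(List_of_Lists)

# Cast to the nullable "Int64" dtype (capital I) so the values stay integers
# and the gaps are stored as <NA> instead of NaN.
df_int = df.astype("Int64")
print(df_int)
```

    The cast only works here because every non-missing value is a whole number.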
    

    Then, to get the naming you want, use pandas.DataFrame.add_prefix. Note that the labels start at Column0, because the default column labels are 0-based.

    df = df.add_prefix('Column')
    print(df)
       Column0  Column1  Column2  Column3
    0        1      2.0      3.0      4.0
    1        5      6.0      7.0      NaN
    2        9     10.0      NaN      NaN
    3       11      NaN      NaN      NaN
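
    Since the question asks for labels that start at 1, the names can also be built by hand instead of prefixing the default 0-based labels. A small sketch; the "column" prefix is just an illustrative choice.

```python
import pandas as pd

List_of_Lists = [[1, 2, 3, 4],
                 [5, 6, 7],
                 [9, 10],
                 [11]]

df = pd.DataFrame(List_of_Lists)

# Build the labels by hand so the numbering starts at 1 instead of 0.
# The "column" prefix is an illustrative choice, not required by pandas.
df.columns = [f"column{i}" for i in range(1, df.shape[1] + 1)]
print(list(df.columns))
```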
    

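    It's also possible that you want each list to be a column instead. In that case you need to transpose your List_of_Lists. Because the lists have different lengths, itertools.zip_longest is used so that shorter lists are padded with None, which pandas shows as NaN.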

    from itertools import zip_longest
    
    df = pd.DataFrame(list(map(list, zip_longest(*List_of_Lists))))
    print(df)
       0    1     2     3
    0  1  5.0   9.0  11.0
    1  2  6.0  10.0   NaN
    2  3  7.0   NaN   NaN
    3  4  NaN   NaN   NaN
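
    The question also asks about getting the result into an Excel sheet, which the steps above stop short of. A rough sketch using DataFrame.to_excel follows. The helper name write_lists_to_excel, its prefix parameter, and the sheet name are all hypothetical, and writing .xlsx files assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd


def write_lists_to_excel(rows, path, prefix="column"):
    """Write a ragged list of lists to one Excel sheet (hypothetical helper)."""
    # Each inner list becomes a row; shorter rows are padded with NaN.
    df = pd.DataFrame(rows)
    # Label the columns prefix1, prefix2, ... (1-based).
    df.columns = [f"{prefix}{i}" for i in range(1, df.shape[1] + 1)]
    # index=False leaves the 0..n-1 row index out of the sheet.
    # Writing .xlsx needs an Excel engine such as openpyxl to be installed.
    df.to_excel(path, sheet_name="Sheet1", index=False)
    return df
```

    For example, write_lists_to_excel(List_of_Lists, "output.xlsx") would write the row-wise layout, where "output.xlsx" is a placeholder file name. With the default settings, missing values end up as empty cells in the sheet.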