
How to create a Polars DataFrame from a dictionary whose values have unequal lengths?


I have a dictionary like this:

ex_dict = {'A': ['false',
  'true',
  'false',
  'false',
  'false',
  'true',
  'true',
  'false',
  'false'],
 'B': ['false',
  'false',
  'true',
  'false',
  'false',
  'false'],
 'C': ['false',
  'true',
  'true',
  'false',
  'false',
  'false',
  'false',
  'false',
  'true']}

I'm creating a DataFrame with:

import polars as pl

pl.DataFrame(ex_dict)

Executing it raises an error:

ShapeError: Could not create a new DataFrame from Series. The Series have different lengths.Got [shape: (9,)

How can I create a Polars DataFrame in this scenario?


Solution

  • You can place each list into its own single-column DataFrame and combine them with pl.concat using how="horizontal". Shorter columns are automatically extended with null values.

    pl.concat(
        items=[pl.DataFrame({name: values})
               for name, values in ex_dict.items()],
        how="horizontal",
    )
    
    shape: (9, 3)
    ┌───────┬───────┬───────┐
    │ A     ┆ B     ┆ C     │
    │ ---   ┆ ---   ┆ ---   │
    │ str   ┆ str   ┆ str   │
    ╞═══════╪═══════╪═══════╡
    │ false ┆ false ┆ false │
    │ true  ┆ false ┆ true  │
    │ false ┆ true  ┆ true  │
    │ false ┆ false ┆ false │
    │ false ┆ false ┆ false │
    │ true  ┆ false ┆ false │
    │ true  ┆ null  ┆ false │
    │ false ┆ null  ┆ false │
    │ false ┆ null  ┆ true  │
    └───────┴───────┴───────┘
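
An alternative, if you would rather fix the data before it reaches Polars, is to pad the shorter lists with None in plain Python. The sketch below is only an illustration under that assumption: pad_to_equal_length is a hypothetical helper name, not part of Polars, and the small dictionary is made-up sample data rather than the ex_dict from the question.

```python
def pad_to_equal_length(data, fill=None):
    """Return a copy of `data` with every list padded to the longest length.

    Hypothetical helper for illustration; it is not a Polars function.
    """
    # Length of the longest value; 0 if the dictionary is empty.
    longest = max((len(values) for values in data.values()), default=0)
    # Append `fill` to each list until it reaches that length.
    return {
        name: list(values) + [fill] * (longest - len(values))
        for name, values in data.items()
    }


# Made-up sample data with unequal lengths.
sample = {"A": ["false", "true", "false"], "B": ["true"]}
padded = pad_to_equal_length(sample)
print(padded)
```

Since every list in the padded dictionary now has the same length, passing it to pl.DataFrame(padded) should no longer raise the ShapeError, and the None entries should show up as null. The concat approach above avoids the extra helper, so this is mainly useful when the dictionary needs cleaning anyway.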