
Ways of creating a struct directly?


I know I can create a Polars struct "scalar" indirectly by using dictionaries as elements when building a series. But is there any way to create a Polars struct scalar directly? (Not in a series or a dataframe.)

For reasons I am not clear on, people think this question might be similar to "Ways of creating a `pyarrow.StructScalar` directly?".

It is not, because:

  • while Polars uses Arrow under the hood, it only provides functionality for translating Arrow arrays into Polars Series, and Arrow tables into Polars DataFrames; there is no functionality for converting Arrow scalars into Polars scalars
  • because Arrow is only used under the hood, we can set it aside when attempting to answer the question: "how can we create a Polars scalar?"
  • finally, note that while the other question had a relatively quick solution, this one does not!

Solution

  • AFAIK, this is not possible. Polars data types describe the contents of Polars DataFrames and Series; you cannot have a value of a Polars dtype outside of a DataFrame or Series. In particular, an element of a struct column is returned as a regular Python dictionary.

    Consider the following example. We create a polars dataframe with a single struct column "my_structs" (with fields "int_field" and "float_field") and a single row of data.

    import polars as pl
    
    # One row whose single column holds a struct with two fields.
    df = pl.DataFrame({
        "my_structs": [{"int_field": 1, "float_field": 1.0}]
    })
    

    Now, if we select the struct column and take the single item it stores, the result will be a regular Python dictionary.

    df.get_column("my_structs").item()
    
    {'int_field': 1, 'float_field': 1.0}