Transform JSON to a Polars DataFrame

I have the following JSON file and I would like to transform it into a Polars DataFrame. How can I use the pl.read_json function, which has a schema parameter?

    {
        "data": {
            "names": [
                "A",
                "B",
                "C",
                "D",
                "E"
            ],
            "ndarray": [
                [
                    "abc",
                    true,
                    0.374618,
                    1,
                    0.83252
                ],
                [
                    "hello",
                    false,
                    0.1265374619,
                    0,
                    0.253
                ]
            ]
        }
    }

Solution

I'm not sure pl.read_json can handle a file structured that way.

The issue is that each inner list in ndarray contains "mixed types", which a Polars list column does not allow.

[
    "abc",      # str
    true,       # bool
    0.374618,   # float
    1,          # int
    0.83252     # float
]

Polars must choose a single type for the whole list; in this case str is chosen as the "supertype":

import polars as pl
pl.select(pl.lit("""["abc", true, 1.23]""").str.json_decode())

shape: (1, 1)
┌─────────────────────────┐
│ literal                 │
│ ---                     │
│ list[str]               │
╞═════════════════════════╡
│ ["abc", "true", "1.23"] │
└─────────────────────────┘

Once the values have been cast to str, there's no way to access the "original" type information.
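
By contrast, Python's json module keeps each value's own type when parsing. A minimal sketch using only the standard library (the variable names are just for illustration): it parses the two rows from the question and shows that although every row is mixed, every column has a single type once the rows are transposed.

```python
import json

# One row from the question's "ndarray", parsed as plain Python objects.
row = json.loads('["abc", true, 0.374618, 1, 0.83252]')
row_types = [type(v).__name__ for v in row]
print(row_types)  # ['str', 'bool', 'float', 'int', 'float']

# Transpose both rows into columns and collect the types seen per column.
rows = [row, json.loads('["hello", false, 0.1265374619, 0, 0.253]')]
columns = [list(col) for col in zip(*rows)]
column_types = [{type(v).__name__ for v in col} for col in columns]
print(column_types)  # [{'str'}, {'bool'}, {'float'}, {'int'}, {'float'}]
```

This is why treating each inner list as a row, rather than as a list value, avoids the mixed-type problem.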

If you load the JSON outside of Polars first (e.g. using the json module), you can pass the rows to pl.DataFrame() directly.

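import json

import polars as pl

with open("data.json") as f:
    data = json.load(f)["data"]

# Each inner list is one row, so state the orientation explicitly.
df = pl.DataFrame(data["ndarray"], schema=data["names"], orient="row")
print(df)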

shape: (2, 5)
┌───────┬───────┬──────────┬─────┬─────────┐
│ A     ┆ B     ┆ C        ┆ D   ┆ E       │
│ ---   ┆ ---   ┆ ---      ┆ --- ┆ ---     │
│ str   ┆ bool  ┆ f64      ┆ i64 ┆ f64     │
╞═══════╪═══════╪══════════╪═════╪═════════╡
│ abc   ┆ true  ┆ 0.374618 ┆ 1   ┆ 0.83252 │
│ hello ┆ false ┆ 0.126537 ┆ 0   ┆ 0.253   │
└───────┴───────┴──────────┴─────┴─────────┘