I'm trying to use the nodejs-polars library, but I've run into a problem converting a JavaScript object to a Polars DataFrame.
Consider the following data, for example:
const myData = [
{
"id": "a",
"name": "fred",
"country": "france",
"age": 30,
"city": "paris" // there's no "city" property elsewhere in `myData`
},
{
"id": "b",
"name": "alexandra",
"country": "usa",
"age": 40
},
{
"id": "c",
"name": "george",
"country": "argentina",
"age": 50
}
]
If we then run
const pl = require("nodejs-polars")
const output = pl.DataFrame(myData)
we get the error:
Error: Lengths don't match: Could not create a new DataFrame from Series. The Series have different lengths
Is there no way to create a Polars DataFrame from an array of objects such that missing values are automatically populated with null?
You are looking for readRecords:
const df = pl.readRecords(myData, {inferSchemaLength: 10})
Nested data types have limited support.
Old answer:
This can be achieved with pl.readJSON if you don't mind doing a small amount of pre-processing:
// Serialize each row to one JSON document per line (newline-delimited JSON).
const ndJSONData = myData
  .map(row => JSON.stringify(row))
  .join("\n")
// -1 will do a full scan, set to a length that best fits your use case.
const df = pl.readJSON(ndJSONData, {"inferSchemaLength": -1})
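
If you would rather keep calling pl.DataFrame directly, another option is to normalize the records in plain JavaScript first, so that every object carries the same keys and missing ones are set to null. The sketch below is only an illustration: normalizeRecords is a hypothetical helper, not part of nodejs-polars, and I have not checked how every nodejs-polars version infers column types from the result.

```javascript
// Hypothetical helper (NOT part of nodejs-polars): returns copies of the
// records in which every record has the same set of keys.
function normalizeRecords(records, fill = null) {
  // Union of all keys across the records, in first-seen order.
  const keys = [...new Set(records.flatMap(row => Object.keys(row)))];
  // Rebuild each row with every key, using `fill` where the key is absent.
  return records.map(row =>
    Object.fromEntries(
      keys.map(key => [
        key,
        Object.prototype.hasOwnProperty.call(row, key) ? row[key] : fill,
      ])
    )
  );
}

const myData = [
  { id: "a", name: "fred", country: "france", age: 30, city: "paris" },
  { id: "b", name: "alexandra", country: "usa", age: 40 },
  { id: "c", name: "george", country: "argentina", age: 50 },
];

const normalized = normalizeRecords(myData);
// Every row now has a "city" key; it is null for rows "b" and "c".
```

Since every row then has the same keys, each column should end up with the same length, which is what the original error complained about. You could then try pl.DataFrame(normalized), though readRecords remains the simpler route.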