I'm trying to truncate the floating-point numbers in my DataFrame to a desired number of decimal places. I've found that this can be done using Pandas and NumPy here, and I've also seen that it might be possible with polars.Config.set_float_precision.
Below is my current approach, but I think I might be taking extra steps.
import polars as pl

data = {
    "name": ["Alice", "Bob", "Charlie"],
    "grade": [90.23456, 80.98765, 85.12345],
}
df = pl.DataFrame(data)
(
    df
    # Convert to string
    .with_columns(
        pl.col("grade").map_elements(
            lambda x: f"{x:.5f}",
            return_dtype=pl.String
        ).alias("formatted_grade")
    )
    # Slice to get desired decimals
    .with_columns(
        pl.col("formatted_grade").str.slice(0, length=4)
    )
    # Convert back to Float
    .with_columns(
        pl.col("formatted_grade").cast(pl.Float64)
    )
)
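You can use Polars' NumPy integration, which lets NumPy universal functions such as np.trunc operate on Polars expressions. Multiplying by 10 before truncating and dividing by 10 afterwards keeps one decimal place: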
df = df.with_columns(truncated_grade=np.trunc(pl.col("grade") * 10) / 10)
Output:
┌─────────┬──────────┬─────────────────┐
│ name ┆ grade ┆ truncated_grade │
│ --- ┆ --- ┆ --- │
│ str ┆ f64 ┆ f64 │
╞═════════╪══════════╪═════════════════╡
│ Alice ┆ 90.23456 ┆ 90.2 │
│ Bob ┆ 80.98765 ┆ 80.9 │
│ Charlie ┆ 85.12345 ┆ 85.1 │
└─────────┴──────────┴─────────────────┘
Full code:
import numpy as np
import polars as pl

data = {
    "name": ["Alice", "Bob", "Charlie"],
    "grade": [90.23456, 80.98765, 85.12345],
}
df = pl.DataFrame(data)
# Scale, truncate toward zero, then scale back to keep one decimal place
df = df.with_columns(truncated_grade=np.trunc(pl.col("grade") * 10) / 10)
print(df)
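
To make the arithmetic behind the `* 10` and `/ 10` explicit, here is a minimal plain-Python sketch of the same scale, truncate, unscale idea applied to single values. The helper name `truncate_decimals` and its `decimals` parameter are hypothetical, introduced only for illustration; they are not part of Polars or NumPy.

```python
import math


def truncate_decimals(value: float, decimals: int) -> float:
    """Drop digits beyond `decimals` places without rounding (hypothetical helper)."""
    factor = 10 ** decimals  # 1 decimal -> 10, 2 decimals -> 100, ...
    # math.trunc discards the fractional part, moving toward zero
    return math.trunc(value * factor) / factor


grades = [90.23456, 80.98765, 85.12345]
print([truncate_decimals(g, 1) for g in grades])
print([truncate_decimals(g, 2) for g in grades])
```

In the Polars expression the same idea should carry over by replacing both 10s with `10 ** decimals`. Because binary floats cannot represent most decimal fractions exactly, a value that lands just under a boundary after scaling may truncate one step lower than expected.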