Update: This was fixed by pull/5837
shape: (3, 3)
┌─────────────────────┬─────────┬──────────────┐
│ dt                  ┆ seconds ┆ duration0    │
│ ---                 ┆ ---     ┆ ---          │
│ datetime[μs]        ┆ f64     ┆ duration[μs] │
╞═════════════════════╪═════════╪══════════════╡
│ 2022-12-14 00:00:00 ┆ 1.0     ┆ 1µs          │
│ 2022-12-14 00:00:00 ┆ 2.2     ┆ 2µs          │
│ 2022-12-14 00:00:00 ┆ 2.4     ┆ 2µs          │
└─────────────────────┴─────────┴──────────────┘
I want to add a duration in seconds to a date/time. My data looks like this:
import polars as pl
df = pl.DataFrame(
    {
        "dt": [
            "2022-12-14T00:00:00", "2022-12-14T00:00:00", "2022-12-14T00:00:00",
        ],
        "seconds": [
            1.0, 2.2, 2.4,
        ],
    }
)
df = df.with_columns(pl.col("dt").cast(pl.Datetime))
My naive attempt was to convert the float column to a duration type so that I could add it to the datetime column (as I would do in pandas).
df = df.with_columns(pl.col("seconds").cast(pl.Duration).alias("duration0"))
print(df.head())
┌─────────────────────┬─────────┬──────────────┐
│ dt                  ┆ seconds ┆ duration0    │
│ ---                 ┆ ---     ┆ ---          │
│ datetime[μs]        ┆ f64     ┆ duration[μs] │
╞═════════════════════╪═════════╪══════════════╡
│ 2022-12-14 00:00:00 ┆ 1.0     ┆ 0µs          │
│ 2022-12-14 00:00:00 ┆ 2.2     ┆ 0µs          │
│ 2022-12-14 00:00:00 ┆ 2.4     ┆ 0µs          │
└─────────────────────┴─────────┴──────────────┘
This gives the correct data type, but the values are all zero.
The documentation is rather sparse on this topic. Are there any better options?
Update: The values being zero is a repr formatting issue that has been fixed with this commit.
pl.duration() can be used in this way (here dt is still the original string column, hence the .str.to_datetime() call):
df.with_columns(
    pl.col("dt").str.to_datetime()
    + pl.duration(nanoseconds=pl.col("seconds") * 1e9)
)
shape: (3, 2)
┌─────────────────────────┬─────────┐
│ dt                      ┆ seconds │
│ ---                     ┆ ---     │
│ datetime[μs]            ┆ f64     │
╞═════════════════════════╪═════════╡
│ 2022-12-14 00:00:01     ┆ 1.0     │
│ 2022-12-14 00:00:02.200 ┆ 2.2     │
│ 2022-12-14 00:00:02.400 ┆ 2.4     │
└─────────────────────────┴─────────┘