Search code examples
pythonpython-polars

How to use a polars column with offset string to add to another date column


Suppose you have

df=pl.DataFrame(
{
    "date":["2022-01-01", "2022-01-02"],
    "hroff":[5,2],
    "minoff":[1,2]
}).with_columns(pl.col('date').str.to_date())

and you want to make a new column that adds the hour and min offsets to the date column. The only thing I saw was the dt.offset_by method. I made an extra column

df=df.with_columns(pl.format('{}h{}m','hroff','minoff').alias('offset'))

and then tried

df.with_columns(pl.col('date') \
               .cast(pl.Datetime).dt.convert_time_zone('UTC') \
               .dt.offset_by(pl.col('offset')).alias('newdate'))

but that doesn't work because dt.offset_by only takes a fixed string, not another column.

What's the best way to do that?


Solution

  • Use pl.duration:

    import polars as pl
    
    df = pl.DataFrame({
        "date": pl.Series(["2022-01-01", "2022-01-02"]).str.to_date(),
        "hroff": [5, 2],
        "minoff": [1, 2]
    })
    
    print(df.select(
        pl.col("date") + pl.duration(hours=pl.col("hroff"), minutes=pl.col("minoff"))
    ))
    
    shape: (2, 1)
    ┌─────────────────────┐
    │ date                │
    │ ---                 │
    │ datetime[μs]        │
    ╞═════════════════════╡
    │ 2022-01-01 05:01:00 │
    │ 2022-01-02 02:02:00 │
    └─────────────────────┘