Search code examples
pythondataframepython-polarsrust-polars

Repeating a date in polars and exploding it


I have a polars dataframe with two date columns that represent a start and end date and then a value that I want to repeat for all dates in between those two dates so that I can join those on other tables.

Example input is

id start end value
123 2022-01-01 2022-01-04 10
abc 2022-03-04 2022-03-04 3
456 2022-05-11 2022-05-16 4

and expected output is

id date value
123 2022-01-01 10
123 2022-01-02 10
123 2022-01-03 10
123 2022-01-04 10
abc 2022-03-04 3
456 2022-05-11 4
456 2022-05-12 4
456 2022-05-13 4
456 2022-05-14 4
456 2022-05-15 4
456 2022-05-16 4

Solution

  • I struggled today with the same problem and I thought I share my solution.

    As cbilot already mentions pl.dat_range doesn't take expressions as low and high value. So I worked around by using apply.

    Data:

    import polars as pl
    from datetime import date
    
    df = pl.DataFrame(
        {
            "id": ["123", "abc", "456"],
            "start": [date(2022, 1, 1), date(2022, 3, 4), date(2022, 5, 11)],
            "end": [date(2022, 1, 4), date(2022, 3, 4), date(2022, 5, 16)],
            "value": [10, 3, 4],
        }
    )
    

    Solution:

    (
        df.with_columns(
            [(pl.struct(["start", "end"])
                .apply(lambda x: pl.date_range(x["start"], x["end"], "1d"))
                .alias("date"))])
        .explode(pl.col("date"))
        .select(["id", "date", "value"])
    )
    
    
    shape: (11, 3)
    ┌─────┬────────────┬───────┐
    │ id  ┆ date       ┆ value │
    │ --- ┆ ---        ┆ ---   │
    │ str ┆ date       ┆ i64   │
    ╞═════╪════════════╪═══════╡
    │ 123 ┆ 2022-01-01 ┆ 10    │
    ├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
    │ 123 ┆ 2022-01-02 ┆ 10    │
    ├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
    │ 123 ┆ 2022-01-03 ┆ 10    │
    ├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
    │ 123 ┆ 2022-01-04 ┆ 10    │
    ├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
    │ ... ┆ ...        ┆ ...   │
    ├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
    │ 456 ┆ 2022-05-13 ┆ 4     │
    ├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
    │ 456 ┆ 2022-05-14 ┆ 4     │
    ├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
    │ 456 ┆ 2022-05-15 ┆ 4     │
    ├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
    │ 456 ┆ 2022-05-16 ┆ 4     │
    └─────┴────────────┴───────┘