Create unique id column for each pair of (col_x, col_y) in polars Python

I have a polars dataframe with subject_id, timestamp, event, col1, and col2 columns.

I want to split this dataframe into two polars dataframes (one with subject_id, timestamp, event and one with subject_id, timestamp, col1, col2), but first create a unique id column so that I can use that id to join the split dataframes back together after grouping/manipulating them separately.

How can I create this unique id column in polars where there is a unique id for every unique subject_id, timestamp pair in the dataframe before splitting?

Essentially, I wish to do what this post provided, but in Polars. I understand Polars does not have indexes, so what is the best approach?

Solution

Looks like I just had to do a bit more digging - it's helpful to try to find a solution in pandas first, then try to replicate it using polars. Answer from this post (its id and cat columns play the role of subject_id and timestamp here):

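import polars as pl

# df is assumed to be an existing polars DataFrame with "id" and "cat" columns.
(
    # Add row index.
    df.with_row_index()
    # Group on id and cat column.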
    .group_by(
        ["id", "cat"],
        maintain_order=True,
    )
    .agg(
        # Create a list of all index positions per group.
        pl.col("index")
    )
    # Add a new row count for each group.
    .with_row_index("ngroup")
    # Expand index list column to separate rows.
    .explode("index")
    # Reorder columns.
    .select("index", "ngroup", "id", "cat")
    # Optionally sort by original order.
    .sort("index")
)