
How to effectively create duplicate rows in polars?


I am trying to port my pandas code to polars, but I am having difficulty duplicating rows (I need this for my pyvista visualizations). In pandas I did the following:

import pandas as pd

df = pd.DataFrame({"key": [1, 2, 3], "value": [4, 5, 6]})

# Wrap each key in a two-element list, then explode the lists into rows
df["key"] = df["key"].apply(lambda x: 2 * [x])
df = df.explode("key", ignore_index=False)

In polars I tried

df = pl.DataFrame({ "key": [1, 2, 3], "value": [4, 5, 6] })

df.with_columns(
    (pl.col("key").map_elements(lambda x: [x]*2))
    .explode()
)

but it raises:

ShapeError: unable to add a column of length 6 to a DataFrame of height 3

I also tried to avoid map_elements using

df.with_columns(
    (pl.col("key").cast(pl.List(float))*2)
    .explode()
)

but it only raises:

InvalidOperationError: can only do arithmetic operations on Series of the same size; got 3 and 1

Any idea how to do this?


Solution

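  • You can use .repeat_by() to turn every value into a list of repeats, then .flatten() to expand those lists back into rows. Applying both to pl.all() inside df.select() keeps every column the same length. The attempt in the question fails because only "key" grows to 6 rows while "value" still has 3.

    import polars as pl

    df = pl.DataFrame({ "key": [1, 2, 3], "value": [4, 5, 6] })
    
    # Repeat each value twice as a list, then flatten the lists into rows
    df.select(pl.all().repeat_by(2).flatten())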
    
    shape: (6, 2)
    ┌─────┬───────┐
    │ key ┆ value │
    │ --- ┆ ---   │
    │ i64 ┆ i64   │
    ╞═════╪═══════╡
    │ 1   ┆ 4     │
    │ 1   ┆ 4     │
    │ 2   ┆ 5     │
    │ 2   ┆ 5     │
    │ 3   ┆ 6     │
    │ 3   ┆ 6     │
    └─────┴───────┘