python dataframe python-polars

Polars Conditional Replacement From Another DataFrame


I have two DataFrames like this.

import numpy as np
import polars as pl

df1 = pl.DataFrame({
    "col_1": np.random.rand(),
    "col_2": np.random.rand(),
    "col_3": np.random.rand()
})
┌──────────┬─────────┬──────────┐
│ col_1    ┆ col_2   ┆ col_3    │
│ ---      ┆ ---     ┆ ---      │
│ f64      ┆ f64     ┆ f64      │
╞══════════╪═════════╪══════════╡
│ 0.534349 ┆ 0.84115 ┆ 0.526435 │
└──────────┴─────────┴──────────┘
df2 = pl.DataFrame({
    "col_1": np.random.randint(0, 2, 5),
    "col_2": np.random.randint(0, 2, 5),
    "col_3": np.random.randint(0, 2, 5)
})
┌───────┬───────┬───────┐
│ col_1 ┆ col_2 ┆ col_3 │
│ ---   ┆ ---   ┆ ---   │
│ i64   ┆ i64   ┆ i64   │
╞═══════╪═══════╪═══════╡
│ 0     ┆ 0     ┆ 0     │
│ 0     ┆ 1     ┆ 0     │
│ 1     ┆ 1     ┆ 1     │
│ 1     ┆ 1     ┆ 0     │
│ 1     ┆ 1     ┆ 1     │
└───────┴───────┴───────┘

I want to replace the 1s in the second DataFrame with the corresponding value from the first DataFrame, and replace the zeros with 1s, resulting in this:

┌──────────┬─────────┬──────────┐
│ col_1    ┆ col_2   ┆ col_3    │
│ ---      ┆ ---     ┆ ---      │
│ f64      ┆ f64     ┆ f64      │
╞══════════╪═════════╪══════════╡
│ 1.0      ┆ 1.0     ┆ 1.0      │
│ 1.0      ┆ 0.84115 ┆ 1.0      │
│ 0.534349 ┆ 0.84115 ┆ 0.526435 │
│ 0.534349 ┆ 0.84115 ┆ 1.0      │
│ 0.534349 ┆ 0.84115 ┆ 0.526435 │
└──────────┴─────────┴──────────┘

I tried reshaping df1 to have the same height as df2, like this:

df1 = df1.select(pl.all().repeat_by(df2.height).explode())

And if I rename the columns so they're not the same, I could horizontally concatenate the 2 DataFrames using pl.concat. But I'm unsure where to go from there. How could I achieve this? Or is there a better approach?


Solution

  • You could build one expression per column name:

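    df2.select(
        # df1 has a single row, so .item() extracts that column's scalar
        pl.when(pl.col(col) == 1)
        .then(df1.get_column(col).item())
        # zeros become 1 (upcast to 1.0 in the f64 result)
        .otherwise(1)
        .alias(col)
        for col in df2.columns
    )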
    
    shape: (5, 3)
    ┌──────────┬─────────┬──────────┐
    │ col_1    ┆ col_2   ┆ col_3    │
    │ ---      ┆ ---     ┆ ---      │
    │ f64      ┆ f64     ┆ f64      │
    ╞══════════╪═════════╪══════════╡
    │ 1.0      ┆ 1.0     ┆ 1.0      │
    │ 1.0      ┆ 0.84115 ┆ 1.0      │
    │ 0.534349 ┆ 0.84115 ┆ 0.526435 │
    │ 0.534349 ┆ 0.84115 ┆ 1.0      │
    │ 0.534349 ┆ 0.84115 ┆ 0.526435 │
    └──────────┴─────────┴──────────┘