Search code examples
dataframerustrust-polars

How to get an item in a polars dataframe column and put it back into the same column at a different location


Still new to polars and rust ... so here is a nooby question:

How do I access a value within a DataFrame at a specific location.

How do I overwrite a value within a DataFrame at a specific location.

Here is a NON-WORKING code:

use polars::prelude::*;

fn main() {
    let df = df! [
        "STOCK"         => ["TSLA", "META", "AA",],
        "STRIKES"       => [10, 20, 5],

    ]
    .unwrap();

    println!("df\t{:?}", df);

    // Take TSLA's STRIKE (10)
    let tsla_strike = df
        .lazy()
        .filter((col("STOCK") == lit("TSLA")))
        .with_column(col("STRIKES"))
        .first()
        .collect();

    let o_i32 = GetOutput::from_type(DataType::Int32);

    // Overwrite AA's STRIKE with tsla_strike (5 ==> 10)
    let df = df
        .lazy()
        .filter((col("STOCK") == lit("AA")).into())
        .with_column(col("STRIKES").map(|x| tsla_strike,o_i32))
        .collect()
        .unwrap();

    println!("df\t{:?}", df);
}

Here is the result I like to get:

RESULT:

df      shape: (3, 2)
┌───────┬─────────┐
│ STOCK ┆ STRIKES │
│ ---   ┆ ---     │
│ str   ┆ i32     │
╞═══════╪═════════╡
│ TSLA  ┆ 10      │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ META  ┆ 20      │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ AA    ┆ 10      │
└───────┴─────────┘

An antipattern way to do it, is to traverse the DF and then at the same time build a new DF with the desired values.


Solution

  • You can use the when -> then -> otherwise construct. When STOCK=="AA" then take the STRIKE where STOCK=="TSLA", otherwise just take the STRIKE. This construct is vectorized and fast (it does not operate on the single elements).

    let df2 = df
        .lazy()
        .clone()
        .select([
            col("STOCK"),
            when(col("STOCK").eq(lit("AA")))
                .then(col("STRIKES").filter(col("STOCK").eq(lit("TSLA"))))
                .otherwise(col("STRIKES"))
        ])
        .collect()?;
    

    Another option in case you have a lot of mappings to do would be a mapping data frame and left joining the replacement values.

    let mapping = df! [
        "ORIGINAL_STOCK"    => ["TSLA", "AA"],
        "REPLACEMENT_STOCK" => ["AA", "META"]
    ]?;
    
    let df2 = df
        .clone()
        .lazy()
        .join(mapping.clone().lazy(), [col("STOCK")], [col("ORIGINAL_STOCK")], JoinType::Left)
        .join(df.clone().lazy(), [col("REPLACEMENT_STOCK")], [col("STOCK")], JoinType::Left)
        .select([
            col("STOCK"),
            when(col("STRIKES_right").is_not_null())
               .then(col("STRIKES_right"))
               .otherwise(col("STRIKES"))
            .alias("STRIKES")
        ])
        .collect()?;