I am very new to Rust so please excuse me if this is a trivial question.
I am trying to filter a dataframe as follows:
let allowed = Series::from_iter(vec![
    "string1".to_string(),
    "string2".to_string(),
]);
let df = LazyCsvReader::new(&fullpath)
    .has_header(true)
    .finish().unwrap()
    .filter(col("string_id").is_in(&allowed)).collect().unwrap();
It looks good to me since the signature of the is_in method looks like this:
fn is_in(
    &self,
    _other: &Series
) -> Result<ChunkedArray<BooleanType>, PolarsError>
from [https://docs.rs/polars/latest/polars/series/trait.SeriesTrait.html#method.is_in]
However, when I compile it I get the following error:
error[E0277]: the trait bound `Expr: From<&polars::prelude::Series>` is not satisfied
--> src/main.rs:33:40
|
33 | .filter(col("string_id").is_in(&allowed)).collect().unwrap();
| ----- ^^^^^^^^ the trait `From<&polars::prelude::Series>` is not implemented for `Expr`
| |
| required by a bound introduced by this call
|
= help: the following other types implement trait `From<T>`:
<Expr as From<&str>>
<Expr as From<AggExpr>>
<Expr as From<bool>>
<Expr as From<f32>>
<Expr as From<f64>>
<Expr as From<i32>>
<Expr as From<i64>>
<Expr as From<u32>>
<Expr as From<u64>>
= note: required for `&polars::prelude::Series` to implement `Into<Expr>`
note: required by a bound in `polars_plan::dsl::<impl Expr>::is_in`
--> /home/myself/.cargo/registry/src/
|
1393 | pub fn is_in<E: Into<Expr>>(self, other: E) -> Self {
| ^^^^^^^^^^ required by this bound in `polars_plan::dsl::<impl Expr>::is_in`
For more information about this error, try `rustc --explain E0277`.
To me this error looks very cryptic. I read the result of rustc --explain E0277
that says "You tried to use a type which doesn't implement some trait in a place which
expected that trait", but this doesn't help in the slightest to identify which type doesn't implement which trait.
NOTE:
I know that writing lit(allowed) instead of &allowed works, but this is not possible because it prevents using allowed anywhere else.
For example, I would like to do the following, but the following code gets (obviously) an error "use of moved value":
let df = LazyCsvReader::new(&fullpath)
    .has_header(true)
    .finish().unwrap()
    .with_column(
        when(
            col("firstcolumn").is_in(lit(allowed))
            .and(
                col("secondcolumn").is_in(lit(allowed))
            )
        )
        .then(lit("very good"))
        .otherwise(lit("very bad"))
        .alias("good_bad")
    )
    .collect().unwrap();
Bonus questions:
- Why do I need to write lit(allowed)? Shouldn't I pass the variable by reference as specified in the documentation?
- How can I use the same Series in several calls to is_in like in the example above without having an error?

EDIT:
I found a different signature for is_in requiring the second parameter to be an Expr, which would justify the need to use lit. However, it's still not clear how to use the same Series multiple times without getting the "use of moved value" error.
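The signature you found is for Series.is_in(), but you're calling Expr.is_in(), which differs: as the error message shows, it accepts any argument that implements Into<Expr>, and &Series does not.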
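To make the error less cryptic, here is a minimal sketch that uses only the standard library. Series, Expr, col, lit and is_in below are toy stand-ins I made up to mirror the shape of the bound in the error message (E: Into<Expr>); they are not the real Polars types, and the real implementations differ.

```rust
// Toy stand-ins, NOT the real Polars types: they only mirror the shape of
// the bound `E: Into<Expr>` quoted in the compiler error.
#[derive(Debug, Clone, PartialEq)]
struct Series(Vec<String>);

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    LiteralStr(String),
    LiteralSeries(Series),
    IsIn(Box<Expr>, Box<Expr>),
}

// As in the help list of the error, `&str` converts into an `Expr`...
impl From<&str> for Expr {
    fn from(s: &str) -> Self {
        Expr::LiteralStr(s.to_string())
    }
}
// ...but there is deliberately no `impl From<&Series> for Expr`.

fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

// Takes the series by value and wraps it in an expression.
fn lit(s: Series) -> Expr {
    Expr::LiteralSeries(s)
}

impl Expr {
    // Same shape as the signature shown in the error message.
    fn is_in<E: Into<Expr>>(self, other: E) -> Self {
        Expr::IsIn(Box::new(self), Box::new(other.into()))
    }
}

fn main() {
    let allowed = Series(vec!["string1".to_string(), "string2".to_string()]);

    // col("string_id").is_in(&allowed); // E0277: no `From<&Series>` for `Expr`

    // An `Expr` is always `Into<Expr>`, so wrapping the series first compiles.
    let expr = col("string_id").is_in(lit(allowed));
    println!("{:?}", expr);
}
```

So the type that "doesn't implement some trait" is &Series, and the missing trait is Into<Expr> (via From<&Series> for Expr). Wrapping the value in lit produces an Expr, which satisfies the bound.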
You can use cols() to select multiple columns:
.with_columns([
cols(["firstcolumn", "secondcolumn"]).is_in(lit(allowed))
])
┌─────────────┬──────────────┬─────────────┐
│ firstcolumn ┆ secondcolumn ┆ thirdcolumn │
│ --- ┆ --- ┆ --- │
│ bool ┆ bool ┆ str │
╞═════════════╪══════════════╪═════════════╡
│ false ┆ false ┆ moo │
│ true ┆ false ┆ foo │
│ true ┆ true ┆ keepme │
│ true ┆ true ┆ andme │
└─────────────┴──────────────┴─────────────┘
Used inside .when(), there is an implicit AND across the selected columns:
┌─────────────┬──────────────┬─────────────┬───────────┐
│ firstcolumn ┆ secondcolumn ┆ thirdcolumn ┆ good_bad │
│ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str ┆ str │
╞═════════════╪══════════════╪═════════════╪═══════════╡
│ a ┆ b ┆ moo ┆ very bad │
│ string1 ┆ no ┆ foo ┆ very bad │
│ string2 ┆ string1 ┆ keepme ┆ very good │
│ string1 ┆ string2 ┆ andme ┆ very good │
└─────────────┴──────────────┴─────────────┴───────────┘
With regard to the moved value error: I have little Rust knowledge, but the compiler tells me:
help: consider cloning the value if the performance cost is acceptable
|
15 | col("firstcolumn").is_in(lit(allowed.clone())).and(col("secondcolumn").is_in(lit(allowed))))
| ++++++++
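The compiler's suggestion can be illustrated without Polars. In this sketch, Series, Literal, lit and two_literals are hypothetical stand-ins: lit takes its argument by value, so each call consumes it. Cloning before every use except the last keeps the original available.

```rust
// Hypothetical stand-ins, not the Polars API: `lit` consumes its argument,
// which is what triggers "use of moved value" on a second call.
#[derive(Debug, Clone, PartialEq)]
struct Series(Vec<String>);

#[derive(Debug, PartialEq)]
struct Literal(Series);

fn lit(s: Series) -> Literal {
    Literal(s)
}

// Builds two literals from one series: clone for the first use,
// move for the last one.
fn two_literals(allowed: Series) -> (Literal, Literal) {
    let first = lit(allowed.clone()); // `allowed` is still usable afterwards
    let second = lit(allowed); // final use, so moving is fine
    (first, second)
}

fn main() {
    let allowed = Series(vec!["string1".to_string(), "string2".to_string()]);
    let (first, second) = two_literals(allowed);
    // Both literals hold the same values.
    assert_eq!(first, second);
    println!("{:?} {:?}", first, second);
}
```

Whether the clone is cheap depends on the type being cloned; for a short list of allowed strings like the one in the question, the cost should be negligible either way.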