I have a Polars DataFrame, for example:
>>> df = pl.DataFrame({'A': ['a', 'b', 'c', 'd'], 'B': ['app', 'nop', 'cap', 'tab']})
>>> df
shape: (4, 2)
┌─────┬─────┐
│ A   ┆ B   │
│ --- ┆ --- │
│ str ┆ str │
╞═════╪═════╡
│ a   ┆ app │
│ b   ┆ nop │
│ c   ┆ cap │
│ d   ┆ tab │
└─────┴─────┘
I'm trying to get a third column C which is True if the string in column B starts with the string in column A of the same row, and False otherwise. So in the case above, I'd expect:
┌─────┬─────┬───────┐
│ A   ┆ B   ┆ C     │
│ --- ┆ --- ┆ ---   │
│ str ┆ str ┆ bool  │
╞═════╪═════╪═══════╡
│ a   ┆ app ┆ true  │
│ b   ┆ nop ┆ false │
│ c   ┆ cap ┆ true  │
│ d   ┆ tab ┆ false │
└─────┴─────┴───────┘
I'm aware of the df['B'].str.starts_with() function, but passing in a column yielded:
>>> df['B'].str.starts_with(pl.col('A'))
... # Some stuff here.
TypeError: argument 'sub': 'Expr' object cannot be converted to 'PyString'
What's the way to do this? In pandas, you would do:
df.apply(lambda d: d['B'].startswith(d['A']), axis=1)
Expression support for .str.starts_with() was added in pull/6355 as part of the Polars 0.15.17 release, so from that version on you can pass a column expression directly:
df.with_columns(pl.col("B").str.starts_with(pl.col("A")).alias("C"))
shape: (4, 3)
┌─────┬─────┬───────┐
│ A   ┆ B   ┆ C     │
│ --- ┆ --- ┆ ---   │
│ str ┆ str ┆ bool  │
╞═════╪═════╪═══════╡
│ a   ┆ app ┆ true  │
│ b   ┆ nop ┆ false │
│ c   ┆ cap ┆ true  │
│ d   ┆ tab ┆ false │
└─────┴─────┴───────┘