Correlation of one column with all other numeric columns

Starting with

import polars as pl
df = pl.DataFrame({
    'a': [1,2,3],
    'b': [4.,2.,6.],
    'c': ['w', 'a', 'r'],
    'd': [4, 1, 1]
})

How can I get the correlation between a and all other numeric columns?

Equivalent in pandas:

In [30]: (
    ...:     pd.DataFrame({
    ...:         'a': [1,2,3],
    ...:         'b': [4.,2.,6.],
    ...:         'c': ['w', 'a', 'r'],
    ...:         'd': [4, 1, 1]
    ...:     })
    ...:     .corr()
    ...:     .loc['a']
    ...: )
Out[30]:
a    1.000000
b    0.500000
d   -0.866025
Name: a, dtype: float64

I've tried

(
    df.select(pl.col(pl.Int64).cast(pl.Float64), pl.col(pl.Float64))
    .select(pl.corr('a', pl.exclude('a')))
)

but got

DuplicateError: the name 'a' is duplicate

Solution

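DataFrame.corr() returns the full correlation matrix of the numeric columns, which you can then filter down to the row (or column) for a. The result has no row labels; the rows follow the same order as the columns.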

df.select(
    pl.col(pl.Int64).cast(pl.Float64), 
    pl.col(pl.Float64)
).corr()

shape: (3, 3)
┌───────────┬───────────┬─────┐
│ a         ┆ d         ┆ b   │
│ ---       ┆ ---       ┆ --- │
│ f64       ┆ f64       ┆ f64 │
╞═══════════╪═══════════╪═════╡
│ 1.0       ┆ -0.866025 ┆ 0.5 │
│ -0.866025 ┆ 1.0       ┆ 0.0 │
│ 0.5       ┆ 0.0       ┆ 1.0 │
└───────────┴───────────┴─────┘