Convert pandas dataframe to sparse matrix

I have a pandas DataFrame, e.g.:

import pandas as pd

data = {'col_1': ['a', 'b'], 'col_2': ['b', 'c']}
df = pd.DataFrame(data)

I want to convert this to a sparse representation of the data in NumPy, e.g.:

[[[1, 0], [0, 0]], [[0, 1], [1, 0]], [[0, 0], [0, 1]]]

where each 2x2 matrix represents the occurrences of 'a', 'b' and 'c', respectively, in my pandas DataFrame.

I can achieve the desired outcome through some messy operations:

boolean_matrix = pd.get_dummies(df, prefix='', prefix_sep='').groupby(level=0, axis=1).sum()

boolean_matrix = boolean_matrix.values.tolist()
boolean_matrix = [[[int(i == j) for j in range(len(boolean_matrix[0]))] for i in row] for row in boolean_matrix]

print(boolean_matrix)

But I can't believe this is the standard way to do what is probably a pretty common operation. Are there any built-in methods (pandas, polars, numpy, tensorflow) that will do this?

Solution

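Use NumPy broadcasting with np.unique, which returns the sorted distinct values. Comparing the data against each value along a new leading axis gives one mask per value: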

import numpy as np

out = (df.to_numpy() == np.unique(df)[:,None,None]).astype(int)

Or, to set the order of the values explicitly:

out = (df.to_numpy() == np.array(['a', 'b', 'c'])[:,None,None]).astype(int)

Output:

array([[[1, 0],
        [0, 0]],

       [[0, 1],
        [1, 0]],

       [[0, 0],
        [0, 1]]])
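
For reference, here is a self-contained sketch of the same approach, split into steps to show how the shapes line up during broadcasting. The intermediate names `values` and `labels` are only illustrative and do not appear in the snippets above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col_1': ['a', 'b'], 'col_2': ['b', 'c']})

values = df.to_numpy()        # shape (2, 2): the original cells
labels = np.unique(values)    # sorted distinct values, shape (3,)

# labels[:, None, None] has shape (3, 1, 1). Comparing it with the (2, 2)
# array broadcasts to (3, 2, 2): one boolean mask per distinct value.
out = (values == labels[:, None, None]).astype(int)

print(labels.tolist())
print(out.shape)
print(out.tolist())
```

Note that the result is a dense array with one full-size mask per distinct value, so memory grows with the number of distinct values times the size of the frame.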