I am attempting to apply the gower_matrix function from the gower package to the values of a dictionary using this chunk of code:
import gower
import pandas as pd
from itertools import chain, combinations
from pydataset import data
from toolz.dicttoolz import valmap
cars = data('mtcars')
vnames=cars.columns
def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(1,len(s)+1))
combos=list(powerset(vnames))
combos=list(map(list, list(powerset(vnames))))
combo_dicts = {}
keys = range(len(combos))
for i in keys:
    combo_dicts[i] = cars[combos[i]]
gower_dicts = valmap(gower.gower_matrix, combo_dicts)
but I am receiving the following error:
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'q') according to the casting rule ''same_kind''
Applying it to a particular dictionary item works:
gower.gower_matrix(combo_dicts[100])
array([[0. , 0.02173357, 0.19395797, ..., 0.12646227, 0.35655078,
0.11454861],
[0.02173357, 0. , 0.21569154, ..., 0.12262693, 0.3348172 ,
0.10900868],
[0.19395797, 0.21569154, 0. , ..., 0.32042024, 0.55050874,
0.10668287],
...,
[0.12646227, 0.12262693, 0.32042024, ..., 0. , 0.23008852,
0.21544196],
[0.35655078, 0.3348172 , 0.55050874, ..., 0.23008852, 0. ,
0.44382587],
[0.11454861, 0.10900868, 0.10668287, ..., 0.21544196, 0.44382587,
0. ]], dtype=float32)
Any idea on the issue?
Based on a web search for ufunc 'true_divide' output, the error occurs when NumPy is asked to store the floating-point result of a division (typecode 'd') in an integer array (typecode 'q'), for example when an integer array is divided in place. This is not a NumPy bug, but behaviour that changed several years ago. It appears to be an undocumented requirement of the gower package that you pass in floating-point values, so convert the cars data to floats first. My guess is that some of your columns contain floating-point values and others contain integers; the test element of combo_dicts works fine because it happens to have been produced only from floating-point columns.
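A minimal sketch of the problem and the suggested fix, using only NumPy and pandas. The small DataFrame is a hypothetical stand-in for an all-integer subset of cars, and the division is only an illustration of the casting rule, not the actual code inside gower:

```python
import numpy as np
import pandas as pd

# Storing a true-division result in an integer array is refused
# under the 'same_kind' casting rule.
ints = np.array([[1, 2], [3, 4]], dtype=np.int64)
try:
    np.divide(ints, 2.0, out=ints)
    raised = False
except TypeError:  # NumPy's casting errors are TypeError subclasses
    raised = True

# Hypothetical stand-in for an integer-only column subset of cars.
df = pd.DataFrame({"cyl": [6, 4, 8], "gear": [4, 4, 3]})

# Converting to float first lets the same kind of division succeed.
as_float = df.astype(float)
values = as_float.to_numpy()
scaled = np.divide(values, values.max(axis=0))
```

If this is indeed the cause, adding cars = cars.astype(float) right after loading the data, before combo_dicts is built, should let the valmap call run over every combination.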