I'm using the clustering module in Python's scikit-learn, and I'd like to use a normalized Euclidean distance. There is no built-in distance for this (that I know of). Here's a list.
So I want to implement my own normalized Euclidean distance using a callable. The function is part of my distance module and is called distance.normalized_euclidean_distance. It takes three inputs: X, Y, and SD.
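For concreteness, here is a minimal sketch of what such a function might look like. This is an assumption for illustration only: the actual implementation in the distance module isn't shown, and the name normalized_euclidean and the per-feature meaning of SD are my guesses at the intent.

```python
import numpy as np

def normalized_euclidean(X, Y, SD):
    """Euclidean distance after scaling each coordinate difference by SD.

    Hypothetical stand-in for the function in the distance module;
    SD is assumed to hold one standard deviation per feature.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    SD = np.asarray(SD, dtype=float)
    return float(np.sqrt(np.sum(((X - Y) / SD) ** 2)))

# Differences are (3, 8); scaled by SD they become (3, 4).
d = normalized_euclidean([0.0, 0.0], [3.0, 8.0], SD=[1.0, 2.0])
```

The point is only that the third argument has to reach the function somehow, while the clustering code calls the metric with two arrays.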
However, the normalized Euclidean distance requires the standard deviation of the population sample, and the pairwise distance in scipy only allows two inputs: X and Y. How do I allow it to take an additional argument?
I tried putting it in as a **kwarg, but that didn't seem to work:
cluster = DBSCAN(eps=1.0, min_samples=1, metric=distance.normalized_euclidean, SD=stdv)
where distance.normalized_euclidean is the function that I wrote that takes in two arrays, X and Y, and computes the normalized Euclidean distance between them.
...but this throws an error:
TypeError: __init__() got an unexpected keyword argument 'SD'
What is the right way to pass additional keyword arguments? Here it says "Any further parameters are passed directly to the distance function.", which made me think that this would be acceptable.
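The TypeError comes from DBSCAN's constructor, which does not accept arbitrary keyword arguments. You can use a lambda function as the metric instead; it takes the two input arrays and passes stdv along itself:
cluster = DBSCAN(eps=1.0, min_samples=1, metric=lambda X, Y: distance.normalized_euclidean(X, Y, SD=stdv))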
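Two alternatives to the lambda, as a rough sketch rather than a definitive recipe: functools.partial binds SD ahead of time, and DBSCAN's metric_params argument (available in reasonably recent scikit-learn versions) forwards keyword arguments to the metric. The normalized_euclidean function and the toy data below are hypothetical stand-ins, since the asker's distance module isn't shown.

```python
from functools import partial

import numpy as np
from sklearn.cluster import DBSCAN

def normalized_euclidean(x, y, SD):
    # Hypothetical stand-in for distance.normalized_euclidean from the question.
    return np.sqrt(np.sum(((x - y) / SD) ** 2))

# Toy data (made up for illustration): two well-separated pairs of points.
data = np.array([[0.0, 0.0], [1.0, 10.0], [100.0, 1000.0], [101.0, 1010.0]])
stdv = data.std(axis=0)  # one standard deviation per feature

# Option 1: bind SD up front, leaving the two-argument signature DBSCAN calls.
metric = partial(normalized_euclidean, SD=stdv)
labels_partial = DBSCAN(eps=1.0, min_samples=1, metric=metric).fit_predict(data)

# Option 2: let DBSCAN forward the keyword argument via metric_params.
labels_params = DBSCAN(
    eps=1.0,
    min_samples=1,
    metric=normalized_euclidean,
    metric_params={"SD": stdv},
).fit_predict(data)
```

Both variants should produce the same clustering as the lambda. Note that a Python callable metric is slow on large data. If the distance really is the plain per-feature standardized one, I believe the built-in "seuclidean" metric with metric_params={"V": stdv ** 2} would be worth trying instead.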