binary, dataset, cluster-analysis, distance, ordinal

Which distance function should be used for mixed-type data?


Dear all,

In clustering, I think the choice of distance function depends on the type of data. What if the data set mixes types, such as continuous variables, categorical variables (nominal and/or ordinal), and binary nominal variables? Is there any guideline for choosing a distance function in this case? If not, what would be a suitable choice for binary nominal variables?

Thank you, Shosho


Solution

  • The book "Finding Groups in Data" by Kaufman and Rousseeuw covers a decent range of algorithms for different types of data, and gives some explanation on what to do about mixed variable types. They include information on binary variables.

    https://onlinelibrary.wiley.com/doi/book/10.1002/9780470316801
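
One widely used option for mixed data is a Gower-style dissimilarity, which averages a per-variable distance scaled to [0, 1]. The sketch below is only an illustration of that idea and is not taken from the book; the function name `gower_dissimilarity`, the `kinds` labels, and the example records are all assumptions made for this example. It treats binary nominal variables as symmetric (a plain match/mismatch), codes ordinal variables as ranks starting at 0, and assumes there are no missing values.

```python
def gower_dissimilarity(a, b, kinds, spans):
    """Gower-style dissimilarity between two records (hypothetical helper).

    kinds[i] is "numeric", "ordinal", or "nominal" (symmetric binary
    variables are treated as nominal). spans[i] is max - min for a numeric
    column, or (number of levels - 1) for an ordinal column coded as ranks
    0..M-1; it is ignored for nominal columns.
    """
    total = 0.0
    for x, y, kind, span in zip(a, b, kinds, spans):
        if kind == "nominal":
            # Simple matching: 0 if the categories agree, 1 otherwise.
            total += 0.0 if x == y else 1.0
        else:
            # Range-scaled absolute difference; a constant column adds 0.
            total += abs(x - y) / span if span else 0.0
    # Average over variables so the result stays in [0, 1].
    return total / len(a)


# Made-up records: (age, colour, education rank, smoker)
kinds = ("numeric", "nominal", "ordinal", "nominal")
spans = (40, None, 2, None)  # age spans 40 years; education has 3 levels
a = (30, "red", 0, True)
b = (50, "blue", 2, True)

d = gower_dissimilarity(a, b, kinds, spans)  # (0.5 + 1 + 1 + 0) / 4 = 0.625
```

If a binary variable is asymmetric, meaning that a shared "absent" value carries little information, simple matching is usually a poor fit and a Jaccard-type coefficient that ignores joint absences is the more common choice. The resulting pairwise dissimilarity matrix can then be passed to any clustering method that accepts precomputed dissimilarities.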