machine-learning recommendation-engine user-profile cosine-similarity

Finding similarity between two user profiles


I have user profiles with the following attributes: U = {age, sex, country, race}. What is the best way to find the similarity between two users? For example, I have the following two users: u1 = {25, M, USA, White} and u2 = {30, M, UK, black}.

I have searched and found that cosine similarity is mentioned a lot. Is it a good fit for my problem, or are there other suggestions?


Solution

  • Similarity measures between objects in cluster analysis are a broad subject.

    What I would suggest is a 'divide and conquer' approach: treat the similarity between two user profiles as a weighted average of the per-attribute similarities. Just remember to normalize each attribute similarity (for example, to the range 0 to 1) before averaging. The weights should be chosen based on the data and the use case: if a match on one dimension matters more than a match on the others, give that dimension more weight in the overall result.

    For the attribute distances, you can try: age -> simple Euclidean distance; sex, race, country -> 0/1 (match or no match). If you have time, the distance between two countries can be better defined based on geolocation or cultural similarity (e.g. language, religion, political system, GDP, ...). But experimenting with the weights for the final average and analyzing the resulting clusters would probably give you more payoff ;-)
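
As a rough illustration of the weighted-average idea, here is a minimal Python sketch. The names `AGE_RANGE`, `WEIGHTS`, `age_similarity`, `match_similarity` and `profile_similarity` are made up for this example, and the age range of 100 and the weight values are assumptions, not recommendations. Age uses the one-dimensional Euclidean distance (the absolute difference), rescaled to [0, 1] and turned into a similarity; the categorical attributes use a 0/1 match.

```python
# Assumed largest meaningful age gap, used to normalize the age distance to [0, 1].
AGE_RANGE = 100.0

# Hypothetical weights; in practice, tune them on your data and use case.
WEIGHTS = {"age": 2, "sex": 1, "country": 1, "race": 1}


def age_similarity(a, b):
    """1 minus the normalized absolute age difference, clipped at 0."""
    return max(0.0, 1.0 - abs(a - b) / AGE_RANGE)


def match_similarity(a, b):
    """0/1 similarity for categorical values (case-insensitive exact match)."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


def profile_similarity(u1, u2, weights=WEIGHTS):
    """Weighted average of the per-attribute similarities, in [0, 1]."""
    sims = {
        "age": age_similarity(u1["age"], u2["age"]),
        "sex": match_similarity(u1["sex"], u2["sex"]),
        "country": match_similarity(u1["country"], u2["country"]),
        "race": match_similarity(u1["race"], u2["race"]),
    }
    total_weight = sum(weights[k] for k in sims)
    return sum(weights[k] * sims[k] for k in sims) / total_weight


u1 = {"age": 25, "sex": "M", "country": "USA", "race": "White"}
u2 = {"age": 30, "sex": "M", "country": "UK", "race": "black"}
print(profile_similarity(u1, u2))  # about 0.58 with the weights above
```

With these assumed weights, the two example users score about 0.58: they are close in age and share the same sex, but differ in country and race. To try a finer-grained country distance, you could swap out `match_similarity` for the country attribute without touching the rest.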