python sorting word2vec cosine-similarity

Sort a Python dictionary by value (word2vec)


I want to sort my dict by value, but the code below doesn't work: it just prints my key-value pairs without any kind of sorting. If I change key=lambda x: x[1] to x[0], it correctly sorts by key, so I don't understand what I'm doing wrong.

My code:

from gensim.models.word2vec import Word2Vec
from scipy.spatial.distance import cosine

e_science = Word2Vec.load("clean_corpus_science.model")
e_pokemon = Word2Vec.load("clean_corpus_pokemon.model")

science_vocab = list(e_science.wv.vocab)
pokemon_vocab = list(e_pokemon.wv.vocab)

vocab_intersection = list(set(science_vocab).intersection(set(pokemon_vocab)))

similarity = []
for i in range(0, len(vocab_intersection)):
  similarity.append(1-cosine(e_science[vocab_intersection[i]], e_pokemon[vocab_intersection[i]]))

hashmap = {}
for i in range(0, len(similarity)):
  hashmap[vocab_intersection[i]] = {similarity[i]} 

dict(sorted(hashmap.items(), key=lambda x: x[1]))

Solution

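  • You're trying to sort sets: {similarity[i]} creates a one-element set, and Python compares sets by the subset relation, not by the numbers inside them, so sorted() has no meaningful order to work with. Take your scores out of the sets, and then you can sort as expected.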

    dict(sorted(hashmap.items(), key=lambda x: tuple(x[1])[0]))
    

    That's pretty ugly, though. You may want to do the cleanup in a separate step, for example by storing similarity[i] itself rather than {similarity[i]} when building hashmap.
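
A minimal sketch of that cleanup, assuming the scores are plain floats. The words and scores here are made-up stand-ins for the word2vec results, and by_score, subset_check and reverse_check are illustrative names, not from the question. It also shows why the sets don't sort: < on sets is a proper-subset test.

```python
# Hypothetical words and similarity scores, standing in for the word2vec output.
words = ["atom", "fire", "water"]
scores = [0.42, 0.91, 0.17]

# On sets, < means "is a proper subset of", not "has a smaller element".
subset_check = {0.42} < {0.91}   # False
reverse_check = {0.91} < {0.42}  # also False, so sorted() gets no usable order

# Store the float itself instead of wrapping it in a one-element set.
hashmap = dict(zip(words, scores))

# Sort the (word, score) pairs by score; dicts keep insertion order (Python 3.7+).
by_score = dict(sorted(hashmap.items(), key=lambda item: item[1]))
print(by_score)  # {'water': 0.17, 'atom': 0.42, 'fire': 0.91}
```

Pass reverse=True to sorted() if you want the most similar words first.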