Does Gensim's Word2Vec offer a way to infer analogous relationships, that is, pairs with the "same shape" as ones already found?
E.g.: starting from King, Queen
I would like to get other pairs with the same male / female relationship.
In other words: most_similar(positive=['king', X], negative=['queen']) -> Y
I would like to find as many (X, Y) pairs as possible.
There's no built-in facility resembling what I think you're asking.
But you are of course free to cycle through any number of candidate words (as X, or the other arguments to most_similar()), to see what top-neighbors are reported (candidate Y values), perhaps applying some threshold of similarity.
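As a rough sketch of that loop (not a built-in Gensim feature): the helper below tries each candidate X in the question's formulation and keeps the neighbors whose similarity clears a cutoff. The name find_analogy_pairs, the 0.5 default threshold, and the candidate list are all made up for illustration. It assumes Gensim 4.x, where a KeyedVectors object (such as model.wv) exposes key_to_index and most_similar().

```python
def find_analogy_pairs(kv, a, b, candidates, threshold=0.5, topn=1):
    """Collect (X, Y, similarity) triples where Y is a top neighbor of a - b + X.

    `kv` is assumed to be a Gensim 4.x KeyedVectors-like object (e.g. `model.wv`).
    The function name and the default threshold are illustrative, not Gensim API.
    """
    pairs = []
    for x in candidates:
        # Skip the seed words themselves and words missing from the vocabulary.
        if x in (a, b) or x not in kv.key_to_index:
            continue
        for y, sim in kv.most_similar(positive=[a, x], negative=[b], topn=topn):
            if sim >= threshold:
                pairs.append((x, y, sim))
    # Strongest matches first.
    return sorted(pairs, key=lambda p: p[2], reverse=True)
```

With a trained model, a call might look like find_analogy_pairs(model.wv, 'king', 'queen', ['woman', 'aunt', 'actress']). A sensible threshold depends heavily on the model and training data, so expect to tune it by inspecting results.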
Note the famous man:king :: woman: _?_ is usually presented to a word2vec model in Gensim as most_similar(positive=['king', 'woman'], negative=['man']), which sort of achieves king - man + woman = _?_. I'm not sure your alternate formulation, effectively king - queen + X = Y, has an analogical meaning for arbitrary X or responses Y.
And note that most_similar() suppresses the reporting of any candidate words that are already arguments to positive or negative. Often, the results of the 'arithmetic' are still closer to the input words than anything else, but that won't be reported; the next-best words are shown instead.
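To see that suppression for yourself, one option is to build the target vector by hand and query with the raw vector, since words are only excluded when they are passed as word arguments. This is a minimal sketch assuming Gensim 4.x, where KeyedVectors provides get_vector(word, norm=True) and similar_by_vector(); the helper name unfiltered_analogy is hypothetical.

```python
def unfiltered_analogy(kv, positive, negative, topn=5):
    """Rank neighbors of the analogy target without hiding the input words.

    `kv` is assumed to be a Gensim 4.x KeyedVectors-like object (e.g. `model.wv`).
    """
    # Add and subtract unit-normalized vectors; this points in the same
    # direction as the averaged vector most_similar() builds from word arguments.
    target = sum(kv.get_vector(w, norm=True) for w in positive)
    target = target - sum(kv.get_vector(w, norm=True) for w in negative)
    # Querying with a raw vector means no words are excluded from the results.
    return kv.similar_by_vector(target, topn=topn)
```

Comparing unfiltered_analogy(model.wv, ['king', 'woman'], ['man']) against the equivalent most_similar() call should show whether one of the input words would otherwise have ranked first.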