Assume I have the following variables:
import gensim
from gensim.models import KeyedVectors
wv = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
dict_dict = {
    "abc": ('dog', 'cat', 'bat'),
    "def": ('fat', 'hat', 'rat')
}
In this situation, wv is a word2vec model.
I want to take the values of each key in dict_dict, extract each value's vector (e.g. wv['dog']), and have the value now act as a key in a sub-dictionary:
dict_dict = {
    "abc": {
        'dog': array([ 5.12695312e-02, -2.23388672e-02, -1.72851562e-01, 1.61132812e-01]),
        'cat': array([ 5.12695312e-02, -2.23388672e-02, -1.72851562e-01, 1.61132812e-01]),
        'bat': array([ 5.12695312e-02, -2.23388672e-02, -1.72851562e-01, 1.61132812e-01])
    }
}
Would I have to create a new dictionary to do this?
You could replace each of the existing dict_dict values, which are currently tuples, with new dicts created via a Python dict comprehension. For example:
for key in dict_dict.keys():
    words = dict_dict[key]
    subdict = { word : wv[word] for word in words}
    dict_dict[key] = subdict
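If you would rather keep the original tuples and build a separate structure, the same idea can be written as a nested dict comprehension. The following is only a sketch under some assumptions: `toy_wv` is a hypothetical stand-in for the real model (a plain mapping from word to vector) so the snippet runs without the GoogleNews file, and `build_vector_dicts` is an illustrative helper name, not part of gensim. It also skips words that have no vector, since looking up a missing word would otherwise raise a KeyError.

```python
# Hypothetical stand-in for the word2vec model: a plain mapping from word to
# vector. A gensim KeyedVectors object supports the same `word in wv` and
# `wv[word]` operations used below.
toy_wv = {
    "dog": [0.1, 0.2],
    "cat": [0.3, 0.4],
    "fat": [0.5, 0.6],
    "hat": [0.7, 0.8],
}

dict_dict = {
    "abc": ("dog", "cat", "bat"),  # "bat" is deliberately missing from toy_wv
    "def": ("fat", "hat", "rat"),  # so is "rat"
}


def build_vector_dicts(groups, vectors):
    """Return a new dict mapping each key to a {word: vector} sub-dictionary.

    Words absent from `vectors` are skipped instead of raising KeyError.
    """
    return {
        key: {word: vectors[word] for word in words if word in vectors}
        for key, words in groups.items()
    }


vector_dicts = build_vector_dicts(dict_dict, toy_wv)
print(vector_dicts["abc"])
```

With the real model you would pass `wv` in place of `toy_wv`. Because the function returns a new dictionary, `dict_dict` itself is left unchanged; assign the result back to `dict_dict` if you want the in-place effect of the loop above.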