Tags: python, numpy, dictionary, word2vec

How to add numpy arrays as values in a dictionary of dictionaries?


Assume I have the following variables:

import gensim
from gensim.models import KeyedVectors
wv = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
dict_dict = {
    "abc": ('dog', 'cat', 'bat'),
    "def": ('fat', 'hat', 'rat')
}

In this situation, wv is a word2vec model.

I want to take each value in dict_dict (a tuple of words), look up each word's vector (e.g. wv['dog']), and turn each word into a key of a sub-dictionary whose value is that vector:

dict_dict = {
    "abc": {
        'dog': array([ 5.12695312e-02, -2.23388672e-02, -1.72851562e-01,  1.61132812e-01]),
        'cat': array([ 5.12695312e-02, -2.23388672e-02, -1.72851562e-01,  1.61132812e-01]),
        'bat': array([ 5.12695312e-02, -2.23388672e-02, -1.72851562e-01,  1.61132812e-01])
    }
}

Would I have to create a new dictionary to do this?


Solution

  • You could replace each of the existing dict_dict values – which are currently tuples – with new dicts, created via a Python dict comprehension. For example:

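    for key, words in dict_dict.items():
        # Map each word to its vector. Reassigning the value of an existing
        # key while iterating is safe because the set of keys does not change.
        dict_dict[key] = {word: wv[word] for word in words}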
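
    If you would rather keep the original dict_dict untouched, or some words might not be in the model's vocabulary (in which case wv[word] raises a KeyError), a nested comprehension that builds a new dict is one option. The sketch below is only an illustration: the plain dict named wv and its two-element vectors are made-up stand-ins so the snippet runs without the GoogleNews file, and vectors_by_key is a hypothetical name. With the real model, the `word in wv` membership test should behave the same way on a KeyedVectors object.

```python
import numpy as np

# Hypothetical stand-in for the word2vec model: a plain dict of tiny made-up vectors.
# With the real model, `wv` would be the KeyedVectors object loaded in the question.
wv = {
    "dog": np.array([0.1, 0.2]),
    "cat": np.array([0.3, 0.4]),
    "hat": np.array([0.5, 0.6]),
}

dict_dict = {
    "abc": ("dog", "cat", "bat"),
    "def": ("fat", "hat", "rat"),
}

# Build a new outer dict and leave dict_dict as it is. Words that are missing
# from the vocabulary are skipped instead of raising KeyError.
vectors_by_key = {
    key: {word: wv[word] for word in words if word in wv}
    for key, words in dict_dict.items()
}

print(sorted(vectors_by_key["abc"]))  # ['cat', 'dog']
```

    Skipping unknown words silently is a design choice; if a missing word should be treated as an error, drop the `if word in wv` filter and let the KeyError surface.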