I have a list of 4-grams that I want to use to populate a dictionary or shelve object:
['I','go','to','work']
['I','go','there','often']
['it','is','nice','being']
['I','live','in','NY']
['I','go','to','work']
So that we have something like:
four_grams['I']['go']['to']['work']=1
Any newly encountered 4-gram should be added under its four keys with the value 1, and that value should be incremented each time the same 4-gram is encountered again.
You could do something like this:
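import shelve
from collections import defaultdict
from functools import reduce  # built in on Python 2; this import is required on Python 3

db = shelve.open('/tmp/db')

grams = [
    ['I','go','to','work'],
    ['I','go','there','often'],
    ['it','is','nice','being'],
    ['I','live','in','NY'],
    ['I','go','to','work'],
]

def f(path, word):
    # Descend one level, creating the nested dict if the word is new.
    if word not in path:
        path[word] = defaultdict(int)
    return path[word]

for gram in grams:
    # Load the subtree stored under the first word, or start a new one.
    path = db.get(gram[0], defaultdict(int))
    # Walk the middle words, then increment the count stored at the last word.
    reduce(f, gram[1:-1], path)[gram[-1]] += 1
    # Reassign so the shelf persists the modified subtree.
    db[gram[0]] = path

print(dict(db))
db.close()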
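To check the approach above, here is a minimal sketch that wraps the same reduce-based update in a function and reads a count back. The names `descend`, `add_gram` and `count_of` are hypothetical helpers made up for this illustration, and a plain dict stands in for the shelf so that the snippet runs without touching disk.

```python
from collections import defaultdict
from functools import reduce


def descend(node, word):
    # Return the child dict for `word`, creating it on first use.
    if word not in node:
        node[word] = defaultdict(int)
    return node[word]


def add_gram(store, gram):
    # `store` is any dict-like object (a plain dict here; assumed to be a shelf in real use).
    subtree = store.get(gram[0], defaultdict(int))
    reduce(descend, gram[1:-1], subtree)[gram[-1]] += 1
    store[gram[0]] = subtree  # reassign so a shelf would persist the change


def count_of(store, gram):
    # Look up a full 4-gram without creating missing keys; 0 if it was never seen.
    node = store
    for word in gram:
        if word not in node:
            return 0
        node = node[word]
    return node


four_grams = {}
for gram in [
    ['I', 'go', 'to', 'work'],
    ['I', 'go', 'there', 'often'],
    ['it', 'is', 'nice', 'being'],
    ['I', 'live', 'in', 'NY'],
    ['I', 'go', 'to', 'work'],
]:
    add_gram(four_grams, gram)

print(count_of(four_grams, ['I', 'go', 'to', 'work']))   # 2
print(count_of(four_grams, ['I', 'live', 'in', 'NY']))   # 1
print(count_of(four_grams, ['we', 'go', 'to', 'work']))  # 0
```

Reading through `count_of` rather than indexing directly avoids the side effect of `defaultdict` silently inserting keys for 4-grams that were never counted.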