Tags: python, dictionary, n-gram, shelve

Python: populate a shelve object/dictionary with multiple keys


I have a list of 4-grams that I want to use to populate a dictionary or shelve object:

['I','go','to','work']
['I','go','there','often']
['it','is','nice','being']
['I','live','in','NY']
['I','go','to','work']

So that we have something like:

four_grams['I']['go']['to']['work']=1

Any newly encountered 4-gram should be stored under its four keys with the value 1, and that value should be incremented each time the same 4-gram is encountered again.


Solution

  • You could do something like this:

    import shelve
    
    from collections import defaultdict
    from functools import reduce  # reduce is not a builtin in Python 3
    
    db = shelve.open('/tmp/db')
    
    grams = [
        ['I','go','to','work'],
        ['I','go','there','often'],
        ['it','is','nice','being'],
        ['I','live','in','NY'],
        ['I','go','to','work'],
    ]
    
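    def f(path, word):
        # Descend one level, creating the nested dict if it is missing.
        if word not in path:
            path[word] = defaultdict(int)
        return path[word]
    
    for gram in grams:
        # Load the subtree stored under the first word, or start a new one.
        path = db.get(gram[0], defaultdict(int))
    
        # Walk the middle words, then increment the count for the last word.
        reduce(f, gram[1:-1], path)[gram[-1]] += 1
    
        # Write the subtree back: by default shelve does not notice
        # in-place changes to a value it has already returned.
        db[gram[0]] = path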
    
    print(dict(db))
    
    db.close()
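
To check the counting logic without touching the disk, the same approach can be sketched against a plain dict. This is only an illustration: the names `descend` and `count_grams` are hypothetical and not part of the answer above, and the function assumes every gram has the same length (here, four words), so a key never needs to hold both a count and a nested dict.

```python
from collections import defaultdict
from functools import reduce


def descend(node, word):
    # Return the child dict for `word`, creating it if it is missing.
    if word not in node:
        node[word] = defaultdict(int)
    return node[word]


def count_grams(db, grams):
    # `db` can be a plain dict or an open shelve object (assumption:
    # anything with .get() and item assignment keyed by the first word).
    for gram in grams:
        subtree = db.get(gram[0], defaultdict(int))
        reduce(descend, gram[1:-1], subtree)[gram[-1]] += 1
        db[gram[0]] = subtree  # reassign so a shelve persists the change
    return db


grams = [
    ['I', 'go', 'to', 'work'],
    ['I', 'go', 'there', 'often'],
    ['it', 'is', 'nice', 'being'],
    ['I', 'live', 'in', 'NY'],
    ['I', 'go', 'to', 'work'],
]

counts = count_grams({}, grams)
print(counts['I']['go']['to']['work'])      # 2
print(counts['I']['go']['there']['often'])  # 1
```

Passing an open shelve object instead of `{}` should give the same nested structure, stored on disk under the first word of each gram.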