Let's say I have a defaultdict in the following form:
theta = defaultdict(float)
The key is a tuple of strings, i.e. (label, word), and the associated value is the probability that the given word fits the given label (part-of-speech tagging).
For example, the word 'stand' could be a noun or a verb. So I could do something like:
theta[('NOUN', 'stand')] = 0.4
theta[('VERB', 'stand')] = 0.6
theta[('ADJ', 'stand')] = 0.0
and so on for the remaining part-of-speech labels.
What I need to do is have the dictionary return a value of 1 by default if it is invoked with a word that it does not contain and the associated label is 'NOUN', and return 0 for all other associated labels. For example:
value = theta[('NOUN', 'wordthatdoesntexist')] # this should be 1
value = theta[('VERB', 'wordthatdoesntexist')] # this should be 0
How can I do this? Can I do it in the initialization step, using lambda? Or is there some other way?
A defaultdict can't do that; the default factory doesn't have access to the key. You'd have to write your own dict subclass, using the __missing__ hook that dicts look for when you try to access a missing key:
class SomeAppropriateName(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def __missing__(self, key):
        val = 1.0 if key[0] == 'NOUN' else 0.0
        # Uncomment the following line if you want to add the value to the dict
        # self[key] = val
        return val
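For illustration, here is a minimal usage sketch of that approach. The class name NounDefaultDict is a placeholder I made up, and the redundant __init__ is left out. It assumes keys are always (label, word) tuples as in the question.

```python
class NounDefaultDict(dict):
    """Hypothetical name; same __missing__ logic as the class above."""

    def __missing__(self, key):
        # key is assumed to be a (label, word) tuple; only 'NOUN' defaults to 1.0
        return 1.0 if key[0] == 'NOUN' else 0.0


theta = NounDefaultDict()
theta[('NOUN', 'stand')] = 0.4
theta[('VERB', 'stand')] = 0.6

known = theta[('VERB', 'stand')]                      # stored value is returned
unseen_noun = theta[('NOUN', 'wordthatdoesntexist')]  # handled by __missing__
unseen_verb = theta[('VERB', 'wordthatdoesntexist')]  # handled by __missing__

# .get() and the `in` operator do not call __missing__,
# so they still treat the key as absent.
via_get = theta.get(('NOUN', 'wordthatdoesntexist'))
is_present = ('NOUN', 'wordthatdoesntexist') in theta
```

Because this version does not store the default, lookups of unseen words leave the dict unchanged. Note that only the square-bracket lookup goes through __missing__; if the code relies on .get(), it would need its own fallback.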