Given a defaultdict(dict)
with both the keys as strings and their value is float, then how do I find the k keys based on descending order of the value from the entire defaultdict.
I can write 2 loops and trivially store a list of 100 pairs, and replace the list with a new entry if something with more than exisitng value is found in n^2 times. However, is there a proper way of doing this in Python.
sample file
defaultdict(<type 'dict'>, {u'just': {u'don': 24.163775416342308, u'like': 28.68171888897304, u'make': 21.69210433035232},'like':{'just':28.68171888897304,'don':12.34, 'mike':27.675}}
desired output (assuming I need only top 3 valued entries from the entire collection)
just,like, 28.68171
like,mike, 27.675
just,don, 24.16377
Whenever you're extracting largest and smallests values, the heapq.nlargest and heapq.nsmallest functions are your new best friend:
>>> from heapq import nlargest
>>> from operator import itemgetter
>>> from pprint import pprint
>>> d = defaultdict(dict,
{'just': {'don': 24.163775416342308,
'like': 28.68171888897304,
'make': 21.69210433035232},
'like': {'don': 12.34,
'just': 28.68171888897304,
'mike': 27.675}})
>>> flattened = ((outerkey, innerkey, value) for outerkey, innerdict in d.items()
for innerkey, value in innerdict.items())
>>> result = nlargest(3, flattened, key=itemgetter(2))
>>> pprint(result)
[('just', 'like', 28.68171888897304),
('like', 'just', 28.68171888897304),
('like', 'mike', 27.675)]
In Python 2, it would be more space efficient to use iteritems()
instead of items()
.