
Group dict values into chunks with Python


I'm trying to figure out a way to group a list of dictionaries into intervals, depending on the value of a key.

In my case, each dictionary has two keys, 'timestamp' and 'value', and I need to group consecutive items into intervals based on 'value'. My data structure looks like this:

[{'timestamp': u'1389631816', 'value': u'0'},
 {'timestamp': u'1389633136', 'value': u'0'},
 {'timestamp': u'1389633256', 'value': u'1'},
 {'timestamp': u'1389633316', 'value': u'1'},
 {'timestamp': u'1389633196', 'value': u'0'},
 {'timestamp': u'1389633196', 'value': u'0'},
 {'timestamp': u'1389633196', 'value': u'0'},
 {'timestamp': u'1389633316', 'value': u'1'}]

In this case, I should have 4 groups:

First Group: 2 items, based on value '0';
Second Group: 2 items, based on value '1';
Third Group: 3 items, based on value '0';
Fourth Group: 1 item, based on value '1'.

Ultimately, I need time-based metrics for these groups (the data comes from Zabbix ICMP checks in this example) to create a report, but I'm really stuck here.


Solution

  • Use the itertools.groupby() function to group these. It groups consecutive items that share the same key, which matches the runs you describe:

    from itertools import groupby
    from operator import itemgetter
    
    # groupby yields one (key, iterator) pair per run of consecutive equal keys
    for value, group in groupby(list_of_dicts, key=itemgetter('value')):
        print('Group for value {}'.format(value))
        for d in group:
            print(d)
    

    Demo:

    >>> from itertools import groupby
    >>> from operator import itemgetter
    >>> list_of_dicts = [{'timestamp': u'1389631816', 'value': u'0'},
    ...  {'timestamp': u'1389633136', 'value': u'0'},
    ...  {'timestamp': u'1389633256', 'value': u'1'},
    ...  {'timestamp': u'1389633316', 'value': u'1'},
    ...  {'timestamp': u'1389633196', 'value': u'0'},
    ...  {'timestamp': u'1389633196', 'value': u'0'},
    ...  {'timestamp': u'1389633196', 'value': u'0'},
    ...  {'timestamp': u'1389633316', 'value': u'1'}]
    >>> for value, group in groupby(list_of_dicts, key=itemgetter('value')):
    ...     print('Group for value {}'.format(value))
    ...     for d in group:
    ...         print(d)
    ... 
    Group for value 0
    {'timestamp': u'1389631816', 'value': u'0'}
    {'timestamp': u'1389633136', 'value': u'0'}
    Group for value 1
    {'timestamp': u'1389633256', 'value': u'1'}
    {'timestamp': u'1389633316', 'value': u'1'}
    Group for value 0
    {'timestamp': u'1389633196', 'value': u'0'}
    {'timestamp': u'1389633196', 'value': u'0'}
    {'timestamp': u'1389633196', 'value': u'0'}
    Group for value 1
    {'timestamp': u'1389633316', 'value': u'1'}
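
Since the goal is a report on the times of these groups, the same groupby() loop could be extended to reduce each run to a few numbers. The following is only a sketch: the function name summarize_runs and the summary keys ('count', 'start', 'end', 'duration') are made up for illustration, and it assumes the timestamps are Unix epoch seconds stored as strings. It uses min() and max() rather than the first and last item because the sample data is not strictly ordered by time.

```python
from itertools import groupby
from operator import itemgetter

list_of_dicts = [
    {'timestamp': u'1389631816', 'value': u'0'},
    {'timestamp': u'1389633136', 'value': u'0'},
    {'timestamp': u'1389633256', 'value': u'1'},
    {'timestamp': u'1389633316', 'value': u'1'},
    {'timestamp': u'1389633196', 'value': u'0'},
    {'timestamp': u'1389633196', 'value': u'0'},
    {'timestamp': u'1389633196', 'value': u'0'},
    {'timestamp': u'1389633316', 'value': u'1'},
]


def summarize_runs(records):
    """Collapse each run of consecutive equal 'value' entries into one summary.

    Hypothetical helper: the name and the summary keys are illustrative only.
    """
    summaries = []
    for value, group in groupby(records, key=itemgetter('value')):
        # The group is a one-shot iterator, so collect the timestamps into a list.
        # Assumption: timestamps are epoch seconds stored as strings.
        times = [int(d['timestamp']) for d in group]
        summaries.append({
            'value': value,
            'count': len(times),
            'start': min(times),
            'end': max(times),
            'duration': max(times) - min(times),  # seconds covered by the run
        })
    return summaries


for run in summarize_runs(list_of_dicts):
    print(run)
```

For the sample data this yields four summaries with counts 2, 2, 3 and 1, matching the four groups expected in the question. If the records can arrive out of order and should be grouped chronologically, they would need to be sorted by timestamp before being passed in.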