python, list, aggregate-functions

Group/Count list of dictionaries based on value


I've got a list of Tokens which looks something like:

[{
    "Value": "Blah",
    "StartOffset": 0,
    "EndOffset": 4
}, ... ]

What I want to do is get a count of how many times each value occurs in the list of tokens.

In VB.Net I'd do something like...

Tokens = Tokens.
GroupBy(Function(x) x.Value).
Select(Function(g) New With {
           .Value = g.Key,
           .Count = g.Count})

What's the equivalent in Python?


Solution

  • If I understand correctly, you can use collections.Counter:

    >>> from collections import Counter
    >>> tokens = [{"Value": "Blah", "SO": 0}, {"Value": "zoom", "SO": 5}, {"Value": "Blah", "SO": 2}, {"Value": "Blah", "SO": 3}]
    >>> Counter(tok['Value'] for tok in tokens)
    Counter({'Blah': 3, 'zoom': 1})
    

    That works if you only need a count. If you want the tokens themselves grouped by value, you could use itertools.groupby, with something like:

    >>> from itertools import groupby
    >>> def keyfn(x):
    ...     return x['Value']
    ...
    >>> [(k, list(g)) for k,g in groupby(sorted(tokens, key=keyfn), keyfn)]
    [('Blah', [{'SO': 0, 'Value': 'Blah'}, {'SO': 2, 'Value': 'Blah'}, {'SO': 3, 'Value': 'Blah'}]), ('zoom', [{'SO': 5, 'Value': 'zoom'}])]
    

    This is a little trickier, because groupby only groups consecutive items that share a key, so you have to sort by the key first.
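
To get closer to the shape of the VB.Net result (one record per value with its count), and to group without sorting first, something like the following might work. This is only a sketch: the helper names count_values and group_by_value are made up for illustration, and the sample offsets are invented to match the token layout in the question.

```python
from collections import Counter, defaultdict

# Sample data modelled on the tokens in the question (offsets are made up).
tokens = [
    {"Value": "Blah", "StartOffset": 0, "EndOffset": 4},
    {"Value": "zoom", "StartOffset": 5, "EndOffset": 9},
    {"Value": "Blah", "StartOffset": 10, "EndOffset": 14},
]


def count_values(tokens):
    """Hypothetical helper: one {'Value', 'Count'} record per distinct value."""
    counts = Counter(tok["Value"] for tok in tokens)
    return [{"Value": value, "Count": n} for value, n in counts.items()]


def group_by_value(tokens):
    """Hypothetical helper: group tokens by value without sorting first."""
    groups = defaultdict(list)
    for tok in tokens:
        # Each token is appended to the list for its value, in input order.
        groups[tok["Value"]].append(tok)
    return dict(groups)


print(count_values(tokens))
print(group_by_value(tokens)["Blah"])
```

The defaultdict version makes a single pass and keeps tokens in their original order within each group, so it avoids the sort that groupby needs.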