
List comprehension with pattern match in Python


I have a list named columns, and I need to build a nested list that groups its elements by their prefix (the first three parts of each string).

For example, the element '101 Drive 1 A' has the prefix '101 Drive 1', and every element sharing that prefix should end up in the same group.

columns = ['101 Drive 1 A','101 Drive 1 B','102 Drive 2 A','102 Drive 2 B','102 Drive 2 C','103 Drive 1 A']

The output should look like this:

[
  ['101 Drive 1 A', '101 Drive 1 B'],
  ['102 Drive 2 A', '102 Drive 2 B', '102 Drive 2 C'],
  ['103 Drive 1 A']
]

Solution

  • One approach uses collections.defaultdict, keyed on the first three characters of each string:

    from collections import defaultdict
    
    columns = ['101 Drive 1 A', '101 Drive 1 B', '102 Drive 2 A', '102 Drive 2 B', '102 Drive 2 C', '103 Drive 1 A']
    
    groups = defaultdict(list)
    for column in columns:
        key = column[:3]
        groups[key].append(column)
    
    res = list(groups.values())
    print(res)
    

    Output

    [['101 Drive 1 A', '101 Drive 1 B'], ['102 Drive 2 A', '102 Drive 2 B', '102 Drive 2 C'], ['103 Drive 1 A']]
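
Since the question asks about a list comprehension, the same grouping can also be sketched with itertools.groupby. This is only a sketch under one assumption: elements that share a key are already adjacent in columns, as they are in the example. groupby merges consecutive items only, so an unordered list would have to be sorted by the same key first.

```python
from itertools import groupby

columns = ['101 Drive 1 A', '101 Drive 1 B', '102 Drive 2 A',
           '102 Drive 2 B', '102 Drive 2 C', '103 Drive 1 A']

# groupby only merges consecutive items with equal keys,
# so this assumes the list is already ordered by prefix.
res = [list(group) for _, group in groupby(columns, key=lambda column: column[:3])]
print(res)
```

For the sample list this yields the same three groups as the defaultdict version.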
    

    A more robust alternative, which works for a leading number of any length, is to use a regular expression:

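    import re
    from collections import defaultdict
    
    groups = defaultdict(list)
    for column in columns:
        # Raw string so that \d is not treated as a string escape;
        # the pattern matches the digits at the start of the string.
        key = re.match(r"\d+", column).group()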
        groups[key].append(column)
    
    res = list(groups.values())
    print(res)
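
Both keys above look only at the leading number, which is enough for the sample data. If the group should instead be defined by the full '101 Drive 1' prefix described in the question, one possible sketch is to key on the first three whitespace-separated tokens. This assumes every element has at least three tokens separated by spaces, which the question implies but does not state.

```python
from collections import defaultdict

columns = ['101 Drive 1 A', '101 Drive 1 B', '102 Drive 2 A',
           '102 Drive 2 B', '102 Drive 2 C', '103 Drive 1 A']

groups = defaultdict(list)
for column in columns:
    # Keep the first three whitespace-separated tokens, e.g. '101 Drive 1'.
    key = " ".join(column.split()[:3])
    groups[key].append(column)

res = list(groups.values())
print(res)
```

With this key, hypothetical elements such as '101 Drive 1 A' and '101 Drive 2 A' would land in different groups, whereas a key based on the leading number alone would merge them.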