python string group-by python-itertools

Using itertools groupby, create groups of elements if ANY key is shared between elements


Given a list of strings, how can I group them so that strings sharing any space-separated value end up in the same group?

inputList = ['w', 'd', 'c', 'm', 'w d', 'm c', 'd w', 'c m', 'o', 'p']

desiredOutput = [['d w', 'd', 'w', 'w d'], ['c', 'c m', 'm', 'm c'], ['o'], ['p']]

How to sort a list properly by first, next, and last items?

My sorting attempt:

groupedList = sorted(inputList, key=lambda ch: [c for c in ch.split()])

Output:

['c', 'c m', 'd', 'd w', 'm', 'm c', 'o', 'p', 'w', 'w d']

Desired output:

['c', 'c m', 'm c', 'm', 'd', 'd w', 'w', 'w d', 'o', 'p']

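My grouping attempt (here g is a list of (string, number) tuples, as shown in the output below):

from itertools import groupby

b = sorted(g, key=lambda elem: [i1[0] for i1 in elem[0].split()]) # sort by all first characters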
b = groupby(b, key=lambda elem: [i1[0] in elem[0].split()[:-1] for i1 in elem[0].split()[:-1]])
b = [[item for item in data] for (key, data) in b]

Output:

[[('c winnicott', 3), ('d winnicott', 2)], [('d w winnicott', 2), ('w d winnicott', 1)], [('w winnicott', 1)]]

Desired output:

[[('c winnicott', 3)], [('d winnicott', 2), ('d w winnicott', 2), ('w d winnicott', 1), ('w winnicott', 1)]]
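
A note on why an attempt like the one above tends to split related items: itertools.groupby only merges consecutive elements whose keys compare equal. It does not check whether two keys overlap. The small sketch below illustrates this with a simplified key (the first token of each string), which is an assumption made for illustration and not the key used in the attempt.

```python
from itertools import groupby

# Already sorted, so equal first tokens are adjacent.
words = ['c', 'c m', 'd', 'd w', 'm', 'm c']

# groupby starts a new run every time the key changes.
runs = [(key, list(group))
        for key, group in groupby(words, key=lambda s: s.split()[0])]

print(runs)
# [('c', ['c', 'c m']), ('d', ['d', 'd w']), ('m', ['m', 'm c'])]
```

'c m' and 'm c' share both tokens but land in different runs, because their keys ('c' and 'm') are not equal.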

Solution

  • I did it with a modified bubble sort that merges neighbouring items when they share a value.

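    def bubbleSort(arr):
        """Bubble-sort-style passes that merge neighbouring groups sharing a token."""
        n = len(arr)
        swapped = False

        for i in range(n - 1):
            for j in range(0, n - i - 1):
                # Compare the tokens of the first string in each neighbouring group.
                g1 = arr[j][0].split()
                g2 = arr[j + 1][0].split()

                # Note: any() receives non-empty lists here, which are always
                # truthy, so the swap runs for every pair.
                if any([k > l for k in g1] for l in g2):
                    swapped = True
                    arr[j], arr[j + 1] = arr[j + 1], arr[j]

                    # Merge the two groups if they share a token and leave
                    # a '-' placeholder behind.
                    if any(s in g2 for s in g1):
                        arr[j].extend(arr[j + 1])
                        arr[j + 1] = ['-']

            if not swapped:
                return arr

        # Drop the placeholders left by merged groups.
        return [a for a in arr if a[0] != '-']

    inputList = ['w', 'd', 'c', 'm', 'w d', 'm c', 'd w', 'c m', 'o', 'p']
    #inputList = ["m", "d", "w d", "m c", "c d"]

    # Wrap each string in its own list so groups can grow by merging.
    inputList = [[n] for n in inputList]

    print(bubbleSort(inputList))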
    

    Output:

    [['p'], ['o'], ['c m', 'm c', 'c', 'm'], ['d w', 'w d', 'w', 'd']]
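
For comparison, the same grouping can also be treated as a connected-components problem: two strings belong together if they share a token, directly or through a chain of other strings. The following is only a minimal sketch using a union-find (disjoint-set) structure, not the method from the answer above. The function name group_by_shared_token is hypothetical, and the sketch assumes every string contains at least one non-whitespace token.

```python
from collections import defaultdict


def group_by_shared_token(items):
    """Group strings that share a whitespace-separated token,
    directly or via a chain of other strings."""
    parent = {}  # token -> parent token in the disjoint-set forest

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Link all tokens that occur together in one string.
    for item in items:
        tokens = item.split()  # assumption: never empty
        for tok in tokens:
            union(tokens[0], tok)

    # Collect strings by the root of their first token.
    groups = defaultdict(list)
    for item in items:
        groups[find(item.split()[0])].append(item)

    return [sorted(group) for group in groups.values()]


inputList = ['w', 'd', 'c', 'm', 'w d', 'm c', 'd w', 'c m', 'o', 'p']
print(group_by_shared_token(inputList))
# [['d', 'd w', 'w', 'w d'], ['c', 'c m', 'm', 'm c'], ['o'], ['p']]
```

The groups contain the same members as desiredOutput in the question; only the order inside each group differs, because each group is sorted alphabetically here.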