Search code examples
pythonlistsortinggroupconnected-components

in python sort and group by common elements in a list


I will try to express myself clearly. Would anyone know how to order elements with common values in a list? I made a drawing to explain it better. enter image description here I have a list with several letters and their two number.

ev=(('A',(1,2)), ('M',(0,40)),('S',(17,32)),('Z',(2,7)),('K',(7,12)),('U',(40,18)),('R',(32,5)),('V',(28,47)),('X',(5,28)))

and after sort I would like get this result. but a don't knwo how find a efficient way.

grp=( ( ('A',(1,2)), ('Z',(2,7)), ('K',(7,12)) ), (  ('M',(0,40)), ('U',(40,18))  ), (  ('S',(17,32)), ('R',(32,5)), ('X',(5,28)), ('V',(28,47))  )

Solution

  • Store the nodes in a more suitable datatype - a dict. The keys are the first (tail) numeric values, and the values are the tuple values of ev, e.g. ('A', (1, 2)). Just walk through the nodes until the next node doesn't exist. That's a group done. Walk through each node that has not been visited. That's all the groups done.

    ev = (('A', (1, 2)), 
          ('M', (0, 40)), 
          ('S', (17, 32)), 
          ('Z', (2, 7)), 
          ('K', (7, 12)), 
          ('U', (40, 18)), 
          ('R', (32, 5)), 
          ('V', (28, 47)), 
          ('X', (5, 28)))
    
    # e.g. nodes[1] = ('A', (1, 2))
    nodes = {tail: (letter, (tail, head)) for letter, (tail, head) in ev}
    
    visited = set()
    groups = []
    
    for tail in nodes:
        if tail in visited:
            continue
        
        group = []
        while True:
            group.append(nodes[tail])
            visited.add(tail)
            
            # Get this node's head, which is the next node's tail.
            tail = nodes[tail][1][1]
            
            # There are no more nodes in the group, so append it to the list of groups
            if tail not in nodes:
                groups.append(tuple(group))
                break
        
    groups = tuple(groups)
    
    print(groups)
    

    Output:

    ((('A', (1, 2)), ('Z', (2, 7)), ('K', (7, 12))), (('M', (0, 40)), ('U', (40, 18))), (('S', (17, 32)), ('R', (32, 5)), ('X', (5, 28)), ('V', (28, 47))))