Search code examples
pythonlistsortingcircular-list

How to do a circular (sub)sort of sets in Python?


Consider the following minimized example:

Code:

a = [(1,'A'), (2,'A'), (3,'A'), (4,'A'), (5,'A')]
b = [(1,'B'), (2,'B'), (3,'B')]
c = []
d = [(1,'D'), (2,'D'), (3,'D'), (4,'D')]

print(sorted(a+b+c+d))

Result:

[(1, 'A'), (1, 'B'), (1, 'D'), (2, 'A'), (2, 'B'), (2, 'D'), (3, 'A'), (3, 'B'), (3, 'D'), (4, 'A'), (4, 'D'), (5, 'A')]

Python sorts the list of sets by the first item of each set and then by the second. That's fine. Now, I need the second sort order to be "circular" in strings (not sure if this is the right term for it). Furthermore, I want to specify the last string in the new ordered list. For example if I specify 'B', the ordered list should start from 'C'. If 'C' doesn't exist it should start from 'D', etc. However, it could also happen that the specified character might not be in the list, e.g. if 'C' doesn't exist the new sorted list should nevertheless start from 'D'.

Edit:

Sorry, I didn't add the desired output order of the list of sets to make it clear. Assuming I would specify mySpecialSort(myList,'B'). Then there should be first all the sets containing 1's as highest priority sort order and then the "circular" strings (here starting from 'D', since there is no C in the list).

Desired sort order:

[(1, 'D'), (1, 'A'), (1, 'B'), (2, 'D'), (2, 'A'), (2, 'B'), (3, 'D'), (3, 'A'), (3, 'B'), (4, 'D'), (4, 'A'), (5, 'A')]

or in shortened readable form: 1D, 1A, 1B, 2D, 2A, 2B, 3D, 3A, 3B, 4D, 4A, 5A

A (cumbersome) solution I came up (however so far only) for the "circular" sort on a single character list (here with duplicates) would be the following:

Code:

myList = ['A', 'D', 'E', 'G', 'Z', 'A', 'J', 'K', 'T']

def myCircularSort(myList,myLast):
    myListTmp = sorted(list(set(myList + [myLast])))                     # add myLast, remove duplicates and sort
    idx = myListTmp.index(myLast)                                        # get index of myLast
    myStart = myListTmp[(idx+1)%len(myListTmp)]                          # get the start list item
    
    myListSorted = sorted(list(set(myList)))                             # sorted original list
    print("Normal sort:                  {}".format(myListSorted))
    idx_start = myListSorted.index(myStart)                              # find start item and get its index
    myNewSort = myListSorted[idx_start:] + myListSorted[0:idx_start]     # split list and put in new order
    print("Circular sort with {} as last: {}\n".format(myLast,myNewSort))

myCircularSort(myList,'D')
myCircularSort(myList,'X')

Result:

Normal sort:                  ['A', 'D', 'E', 'G', 'J', 'K', 'T', 'Z']
Circular sort with D as last: ['E', 'G', 'J', 'K', 'T', 'Z', 'A', 'D']

Normal sort:                  ['A', 'D', 'E', 'G', 'J', 'K', 'T', 'Z']
Circular sort with X as last: ['Z', 'A', 'D', 'E', 'G', 'J', 'K', 'T']    # X actually not in the list

However, now I am stuck on how to get this "circular" sort (on the second item of the list of sets) together with the "normal" sort (on the first item of the list of sets).

Alternatively, I could possibly think of a "brute force" method to find the highest index (here: 4) and all existing strings (here: A-Z) and check the existence of each combination in two nested for-loops. Am I on the right track or would I do something horribly complicated and inefficient or am I missing some smart Python features?

Edit2:

After some further search, I guess lambda and cmp(x,y) would have done the job (see an example), but it doesn't seem to exist in Python3 anymore. So, then probably maybe something with operator.itemgetter() or operator.methodcaller() from which I still don't have a clue how to use since I'm missing good examples...


Solution

  • You can use a dict to map a letter to its correct position:

    from string import ascii_uppercase as ABC
    
    start = ABC.index('D') + 1
    
    sorter = {
        ABC[(n + start) % len(ABC)]: n
        for n in range(len(ABC))
    }
    
    myList = ['A', 'D', 'E', 'G', 'Z', 'A', 'J', 'K', 'T']
    
    print(sorted(myList, key=sorter.get))
    
    # ['E', 'G', 'J', 'K', 'T', 'Z', 'A', 'A', 'D']
    

    To work with arbitrary keywords, extract them into a keys list, rearrange it as desired and use keys.index(word) as a sort key:

    myList = [
        (1, 'ARTHUR'),
        (2, 'CHARLIE'),
        (3, 'GEORGE'),
        (4, 'HARRY'),
        (5, 'JACK'),
        (6, 'LEO'),
        (7, 'MUHAMMAD'),
        (8, 'NOAH'),
        (9, 'OLIVER'),
    ]
    
    
    def circ_sorted(lst, start):
        keys = sorted(e[1] for e in lst)
        less = sum(1 for k in keys if k <= start)
        keys = keys[less:] + keys[:less]
        return sorted(lst, key=lambda e: (keys.index(e[1]), e[0]))
    
    print(circ_sorted(myList, 'LEO')) ## [MUHAMMAD, NOAH...]
    print(circ_sorted(myList, 'IAN')) ## [JACK, LEO...]