Tags: python, string, list, sorting, numeric

How can I sort a list of strings in ascending order of their numeric parts?


I have a list of paths that contain multiple numerical parts. Here is part of it:

'C:\\Python\\Python310\\Scripts\\mockup_test\\17mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\18mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\19mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\1mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\20mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\21mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\29mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\2mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\30mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\31mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\38mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\39mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\3mm.JPG'

Calling .sort() doesn't change it, because the list is already sorted as far as string (lexicographic) comparison is concerned.
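
For context, Python compares strings character by character, so '17mm' comes before '1mm' because '7' sorts before 'm'. A minimal illustration, using shortened, hypothetical file names instead of the full paths:

```python
# Shortened, hypothetical file names standing in for the full paths.
names = ['17mm.JPG', '1mm.JPG', '20mm.JPG', '2mm.JPG']

# Strings are compared character by character: '1' == '1', then '7' < 'm'.
print('17mm.JPG' < '1mm.JPG')   # True
print(sorted(names) == names)   # True: already in lexicographic order
print(17 < 1)                   # False: compared as integers, the order differs
```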

Here is the expected result:

'C:\\Python\\Python310\\Scripts\\mockup_test\\1mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\2mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\3mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\17mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\18mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\19mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\20mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\21mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\29mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\30mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\31mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\38mm.JPG',
'C:\\Python\\Python310\\Scripts\\mockup_test\\39mm.JPG'

Does anyone know how this could be achieved?


Solution

  • The following sorts a list using every part of each string: runs of consecutive digits are compared as int, and everything else is compared as str:

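    import re

    def split_str_int(s):
        # Split on runs of digits; the capture group keeps the digits in the result,
        # so odd indices hold digit strings and even indices hold the other text.
        a = re.split(r'(\d+)', s)
        # Convert the digit runs to int so they compare numerically.
        a[1::2] = map(int, a[1::2])
        return a

    # mylist is the list of paths from the question.
    newlist = sorted(mylist, key=split_str_int)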
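
To make the key concrete, here is a small sketch of what split_str_int returns. The shortened path is hypothetical and used only for illustration; the function is repeated so the snippet runs on its own.

```python
import re

def split_str_int(s):
    # Same key function as above, repeated so this snippet is self-contained.
    a = re.split(r'(\d+)', s)
    a[1::2] = map(int, a[1::2])
    return a

# Hypothetical shortened path, for illustration only.
print(split_str_int('mockup_test\\17mm.JPG'))
# ['mockup_test\\', 17, 'mm.JPG']

# A string that starts with a digit still begins with a str (the empty string),
# so str parts stay at even indices and int parts at odd indices.
print(split_str_int('17mm'))
# ['', 17, 'mm']
```

Because the parts alternate str, int, str, ... at the same positions for every input, two keys are compared str against str and int against int.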
    

    On your data:

    >>> newlist
    ['C:\\Python\\Python310\\Scripts\\mockup_test\\1mm.JPG',
     'C:\\Python\\Python310\\Scripts\\mockup_test\\2mm.JPG',
     'C:\\Python\\Python310\\Scripts\\mockup_test\\3mm.JPG',
     'C:\\Python\\Python310\\Scripts\\mockup_test\\17mm.JPG',
     'C:\\Python\\Python310\\Scripts\\mockup_test\\18mm.JPG',
     'C:\\Python\\Python310\\Scripts\\mockup_test\\19mm.JPG',
     'C:\\Python\\Python310\\Scripts\\mockup_test\\20mm.JPG',
     'C:\\Python\\Python310\\Scripts\\mockup_test\\21mm.JPG',
     'C:\\Python\\Python310\\Scripts\\mockup_test\\29mm.JPG',
     'C:\\Python\\Python310\\Scripts\\mockup_test\\30mm.JPG',
     'C:\\Python\\Python310\\Scripts\\mockup_test\\31mm.JPG',
     'C:\\Python\\Python310\\Scripts\\mockup_test\\38mm.JPG',
     'C:\\Python\\Python310\\Scripts\\mockup_test\\39mm.JPG']
    

    Note that the key uses every part found in the strings, numerical and non-numerical alike, in order of appearance. This is to comply with: "(...) which have multiple numerical parts".

    For example:

    >>> mylist = [
    ...     'ab6cd45',
    ...     'ab6cd2',
    ...     'a6cd3',
    ...     'ab4cd60',
    ...     'a',
    ... ]
    >>> sorted(mylist, key=split_str_int)
    ['a', 'a6cd3', 'ab4cd60', 'ab6cd2', 'ab6cd45']