python string list sorting

Sort string by multiple separated numbers


I have a list of paths, simplified here into shorter strings with the same structure:

paths = ['apple10/banana2/carrot1', 'apple10/banana1/carrot2', 'apple2/banana1', 'apple2/banana2', 'apple1/banana1', 'apple1/banana2', 'apple10/banana1/carrot1']

These paths need sorting in the order of the numbers. The first number (apple) is the most important in the sort, followed by the second (banana).

One added complication is that some of the paths have a third directory level containing the data, while others do not.

A minimal example of the path structure looks like this:

parent 
|-----apple1 
          |------banana1 
                   |----- data*
          |------banana2 
                   |----- data*
|-----apple2
          |------banana1 
                   |----- data*
          |------banana2 
                   |----- data*
|-----apple10
          |------banana1 
                   |-----carrot1
                            |-----data*
                   |-----carrot2
                            |-----data*
          |------banana2 
                   |----- carrot1
                             |-----data*

The desired output is:

paths = ['apple1/banana1', 'apple1/banana2', 'apple2/banana1', 'apple2/banana2', 'apple10/banana1/carrot1', 'apple10/banana1/carrot2','apple10/banana2/carrot1']

I'm struggling to work out how to do this. A plain sort will not work because it compares the strings character by character, so once the numbers reach double digits, 10 comes before 2.
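To illustrate the problem, here is a small check using three of the paths above (the variable name `lexicographic` is only for illustration):

```python
sample = ['apple10/banana1/carrot1', 'apple2/banana1', 'apple1/banana2']

# Plain string comparison works character by character, so '1' < '2'
# places 'apple10/...' ahead of 'apple2/...'.
lexicographic = sorted(sample)
print(lexicographic)
# ['apple1/banana2', 'apple10/banana1/carrot1', 'apple2/banana1']
```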

I have seen another answer, "How to correctly sort a string with a number inside?", which handles a single number per string, but I have failed to adapt it to my problem.


Solution

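  • Use sorted with a custom key that uses re to extract every number from the path as an int. Python compares the resulting lists element by element, so the first number takes priority, then the second, and so on:

    >>> import re
    >>> sorted(paths, key=lambda x: list(map(int, re.findall(r"\d+", x))))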
    ['apple1/banana1',
     'apple1/banana2',
     'apple2/banana1',
     'apple2/banana2',
     'apple10/banana1/carrot1',
     'apple10/banana1/carrot2',
     'apple10/banana2/carrot1']
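
The same idea can be written as a named key function, which may be easier to reuse and test. This is only a sketch: `numeric_key` is a hypothetical helper name, and it assumes that the order depends only on the numbers in each path, not on the directory names. It also shows why paths of different depth cause no trouble: list comparison goes element by element, and a shorter list sorts before a longer one that starts with the same values.

```python
import re

def numeric_key(path):
    """Return every run of digits in `path` as a list of ints (hypothetical helper)."""
    return [int(n) for n in re.findall(r"\d+", path)]

paths = ['apple10/banana2/carrot1', 'apple10/banana1/carrot2', 'apple2/banana1',
         'apple2/banana2', 'apple1/banana1', 'apple1/banana2',
         'apple10/banana1/carrot1']

print(numeric_key('apple10/banana2/carrot1'))  # [10, 2, 1]

# Lists are compared element by element: 2 < 10 decides the order
# before the differing lengths ever matter.
print([2, 1] < [10, 1, 1])  # True

result = sorted(paths, key=numeric_key)
print(result)
```

Note that paths with identical numbers but different names would compare as equal under this key; sorted is stable, so they would keep their original relative order.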