I have a list of filenames that I need to sort based on a section within the string. However, it only works if I make the file extension part of my sorting list. I want this to work whether the file is a .jpg or a .png, so I am trying to split on both the '_' and the '.' character.
sorting = ['FRONT', 'BACK', 'LEFT', 'RIGHT', 'INGREDIENTS', 'INSTRUCTIONS', 'INFO', 'NUTRITION', 'PRODUCT']
filelist = ['3006345_2234661_ENG_PRODUCT.jpg', '3006345_2234661_ENG_FRONT.jpg', '3006345_2234661_ENG_LEFT.jpg', '3006345_2234661_ENG_RIGHT.jpg', '3006345_2234661_ENG_BACK.jpg', '3006345_2234661_ENG_INGREDIENTS.jpg', '3006345_2234661_ENG_NUTRITION.jpg', '3006345_2234661_ENG_INSTRUCTIONS.jpg', '3006345_2234661_ENG_INFO.jpg']
sort = sorted(filelist, key = lambda x : sorting.index(x.re.split('_|.')[3]))
print(sort)
This returns the error "AttributeError: 'str' object has no attribute 're'"
What do I need to do to split on both the _ and . when splitting out my strings for sorting? I only want to use the split for the sorting, not for re-forming the strings.
Here's the fixed code:
import re
sorted_output = sorted(filelist, key=lambda x: sorting.index(re.split(r'_|\.', x)[3]))
The string to split should be passed as the second argument to re.split(). It is a function in the re module, not a string method, so you cannot call it on a string. The first argument is the regular expression itself, which you already had in the right position.
Also: you need to escape the . with a \ because the full stop (period) is a special character in regular expressions that matches any single character (except a newline, by default).
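To make the effect of the escaping concrete, here is a small sketch that splits one of the filenames from the question three ways. The variable names are only illustrative, and the character class [_.] is an alternative spelling that is not part of the original answer.

```python
import re

name = "3006345_2234661_ENG_FRONT.jpg"

# Unescaped '.' matches any character, so every character is treated as a
# separator and only empty strings are left over.
unescaped = re.split(r'_|.', name)

# Escaped '\.' matches only a literal period.
escaped = re.split(r'_|\.', name)

# Inside a character class the period is literal, so no escape is needed.
char_class = re.split(r'[_.]', name)

print(set(unescaped))  # {''}
print(escaped)         # ['3006345', '2234661', 'ENG', 'FRONT', 'jpg']
print(char_class)      # same list as above
```

Index 3 of the escaped result is the label used for sorting, regardless of the file extension.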
Output:
In [13]: sorted(filelist,key=lambda x: sorting.index(re.split(r'_|\.',x)[3]))
Out[13]:
['3006345_2234661_ENG_FRONT.jpg',
'3006345_2234661_ENG_BACK.jpg',
'3006345_2234661_ENG_LEFT.jpg',
'3006345_2234661_ENG_RIGHT.jpg',
'3006345_2234661_ENG_INGREDIENTS.jpg',
'3006345_2234661_ENG_INSTRUCTIONS.jpg',
'3006345_2234661_ENG_INFO.jpg',
'3006345_2234661_ENG_NUTRITION.jpg',
'3006345_2234661_ENG_PRODUCT.jpg']
Edit: as @Todd mentions in the comments, if you additionally want files that share the same label to be ordered among themselves (for example by the numeric part at the start of the name), add the filename itself as a secondary sort key:
sorted(filelist,key=lambda x: [sorting.index(re.split(r'_|\.',x)[3]),x])
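As a rough illustration of the secondary key, the sketch below uses a shortened sorting list and a few made-up filenames (the second product number and the .png files are hypothetical, not from the question). It uses a tuple rather than a list for the key, which behaves the same way here.

```python
import re

sorting = ['FRONT', 'BACK', 'LEFT']

# Hypothetical filenames: two product numbers and mixed extensions.
files = [
    '3006345_2234662_ENG_BACK.png',
    '3006345_2234661_ENG_FRONT.jpg',
    '3006345_2234662_ENG_FRONT.png',
    '3006345_2234661_ENG_BACK.jpg',
]

def sort_key(name):
    # Primary key: position of the label in the sorting list.
    # Secondary key: the full filename, which breaks ties between equal labels.
    label = re.split(r'[_.]', name)[3]
    return (sorting.index(label), name)

result = sorted(files, key=sort_key)
for name in result:
    print(name)
```

Both FRONT files come first, ordered by filename, followed by both BACK files. Note that sorting.index() raises a ValueError for a label that is not in the list, so this assumes every filename carries a known label.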