python-2.7 file glob

How can I re-order files from glob according to the digits after the last underscore (_)?


I would like to reorder my files by the number at the end of their names.

To do so, I use glob to list the files and then try to sort them.

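import glob
import os

# Resolve both paths up front: after the first os.chdir(), a relative
# './labels/' would be looked up inside the features directory.
features_directory = os.path.abspath('./features/')
labels_directory = os.path.abspath('./labels/')

os.chdir(features_directory)
Features = glob.glob("*.npy")  # len(Features) = 13000

os.chdir(labels_directory)
Labels = glob.glob("*.npy")  # len(Labels) = 13000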

However, they are still not in numeric order, even when I use sorted():

Features = sorted(glob.glob("*.npy"))
Labels = sorted(glob.glob("*.npy"))

print(Features)  

Result:

['features_train_data_10.npy','features_train_data_123.npy',...,'features_train_data_13000.npy'] 

and

print(Labels)

Result:

['labels_train_data_98.npy','labels_train_data_45.npy',...,'labels_train_data_117.npy']

Expected output:

['features_train_data_1.npy','features_train_data_2.npy',...,'features_train_data_13000.npy'] 
['labels_train_data_1.npy','labels_train_data_2.npy',...,'labels_train_data_13000.npy']

Thank you for your help


Solution

  • By default, strings are sorted lexicographically, so '10' sorts before '2'. Pass a key function to sorted() to sort by the numeric portion of the file names:

    import re
    Features=sorted(glob.glob("*.npy"), key=lambda n: int(re.findall(r'\d+', n)[0]))
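
The same idea can be applied to both lists. The sketch below is one possible variant, not the answer's own code: it reads the digits after the last underscore instead of the first run of digits, which only matters if a directory or file prefix also contains digits. The helper name trailing_number and the sample lists are hypothetical stand-ins for the glob.glob("*.npy") results.

```python
import os
import re


def trailing_number(path):
    """Return the integer between the last underscore and '.npy'."""
    name = os.path.basename(path)
    match = re.search(r'_(\d+)\.npy$', name)
    if match is None:
        raise ValueError("no trailing number in %r" % name)
    return int(match.group(1))


# Hypothetical sample names standing in for the glob.glob("*.npy") results.
features = ['features_train_data_10.npy',
            'features_train_data_2.npy',
            'features_train_data_1.npy']
labels = ['labels_train_data_2.npy',
          'labels_train_data_10.npy',
          'labels_train_data_1.npy']

features_sorted = sorted(features, key=trailing_number)
labels_sorted = sorted(labels, key=trailing_number)

# Sorting both lists with the same key keeps feature i paired with label i.
pairs = list(zip(features_sorted, labels_sorted))
print(pairs[0])  # ('features_train_data_1.npy', 'labels_train_data_1.npy')
```

Using one key function for both lists means the two sorted lists line up index by index, provided every feature file has a label file with the same number.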