Find filename with highest integer in name


Summary

If FCR_Network_Coordinates_0 and FCR_Network_Coordinates_2 exist, the next file should be written to FCR_Network_Coordinates_3, not to FCR_Network_Coordinates_1.

Details

I have the following problem:

I want to write a new CSV file if none exists yet, and otherwise increase the number at the end of the file name. If, for example, files numbered 1 and 3 exist but none numbered 2, the next file should be numbered 4. In other words, it should add 1 to the highest existing number.

My code so far is:

    index = 0
    while os.path.exists('../FCR_Network_Coordinates_'+ str(index) + '.csv'):
        index+=1            
    with open('../FCR_Network_Coordinates_'+str(index)+'.csv', 'wb') as csv_file:
        writer = csv.writer(csv_file, delimiter=";")
        for key, value in sparse1.items():
            writer.writerow(['{:.1f}'.format(t) for t in key]+value)

EDIT

It should also work for paths where parameters are included in the file name:

    "../FCR_Network_Coordinates_"+"r_"+radius+"x_"+x+"y_"+y+"z_"+z+"fcr_"+fcr_size+"_"+new_number+".csv"

could look like:

FCR_Network_Coordinates_radius_3_x_0.3_y_0.3_z_2_fcr_2_1.csv

EDIT2

Furthermore, if the file name contains other parameters, it should not use the highest number across all files, but only the highest number among the files that share those parameters.


Solution

  • Something like the following should work for you:

    import csv
    import glob
    import os

    # .....

    existing_matches = glob.glob('../FCR_Network_Coordinates_*.csv')

    # Collect the trailing number of every matching file name
    used_numbers = []
    for f in existing_matches:
        try:
            file_number = int(os.path.splitext(os.path.basename(f))[0].split('_')[-1])
            used_numbers.append(file_number)
        except ValueError:
            # The part after the last underscore is not a number; skip this file
            pass

    # Start at 1 if no numbered file was found (this also avoids max() on an empty list)
    save_number = max(used_numbers) + 1 if used_numbers else 1

    # 'wb' is the Python 2 mode for csv files; on Python 3 use open(..., 'w', newline='')
    with open('../FCR_Network_Coordinates_{}.csv'.format(save_number), 'wb') as csv_file:
        writer = csv.writer(csv_file, delimiter=";")
        for key, value in sparse1.items():
            writer.writerow(['{:.1f}'.format(t) for t in key] + value)
    

    glob finds all files whose names match your pattern, where * is used as a wildcard.
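
    To see which names the wildcard accepts without touching the disk, here is a small sketch using fnmatch, the standard-library module that implements the same shell-style matching glob applies to file names. The list of names is made up for illustration.

```python
import fnmatch

pattern = 'FCR_Network_Coordinates_*.csv'

# Hypothetical directory listing, used only for this illustration
names = [
    'FCR_Network_Coordinates_0.csv',
    'FCR_Network_Coordinates_2.csv',
    'FCR_Network_Coordinates_r_3_x_0.3_y_0.3_z_2_fcr_2_1.csv',
    'FCR_Network_Coordinates_notes.txt',
    'Other_0.csv',
]

# '*' matches any run of characters, including underscores and dots
matches = fnmatch.filter(names, pattern)
print(matches)
```

    Note that the third name matches as well: the plain pattern does not distinguish between parameter sets, so a narrower pattern is needed once parameters appear in the file name.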

    We then use os.path to manipulate each filename and work out what the number in the name is:

    • os.path.basename() gets us just the filename - e.g. 'FCR_Network_Coordinates_1.csv'
    • os.path.splitext() splits the file name ('FCR_Network_Coordinates_1') from the extension ('.csv'). Taking the element at index 0 gets us the filename rather than the extension
    • splitting on '_' breaks the name at every underscore, giving the list ['FCR', 'Network', 'Coordinates', '1']. Index -1 is the last entry in this list, i.e. the '1'.
    • we wrap this in int() so that we can do arithmetic with it.
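
    The steps in the list above can be traced one at a time. This is only an illustration on a single example path; the intermediate variable names are not part of the original code.

```python
import os

path = '../FCR_Network_Coordinates_1.csv'

name = os.path.basename(path)        # 'FCR_Network_Coordinates_1.csv'
stem, ext = os.path.splitext(name)   # ('FCR_Network_Coordinates_1', '.csv')
parts = stem.split('_')              # ['FCR', 'Network', 'Coordinates', '1']
file_number = int(parts[-1])         # 1

print(file_number + 1)               # next number if this were the highest
```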

    We also catch the ValueError that int() raises when a filename has letters rather than a number after the last underscore. Then we take the max of the numbers found and add one. If no numbers have been found, we use 1 for the filename.
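
    The number-picking logic can also be separated from the file system, which makes the fallback cases easy to check. The following is a sketch, not the only way to do it; the function names trailing_number and next_save_number are invented for this example.

```python
import os

def trailing_number(filename):
    """Return the integer after the last underscore, or None if there is none."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    try:
        return int(stem.split('_')[-1])
    except ValueError:
        # Letters (or nothing numeric) after the last underscore
        return None

def next_save_number(filenames):
    """Return highest trailing number + 1, or 1 if no file is numbered."""
    numbers = [n for n in map(trailing_number, filenames) if n is not None]
    return max(numbers) + 1 if numbers else 1

print(next_save_number(['../FCR_Network_Coordinates_0.csv',
                        '../FCR_Network_Coordinates_2.csv']))
```

    With files 0 and 2 present this gives 3, as asked for in the question. An empty list, or a list in which no name ends in a number, gives 1.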

    EDIT: In response to the question update, we just need to alter our glob and the final name we write to - the glob changes to:

    existing_matches = glob.glob('../FCR_Network_Coordinates_r_{}_x_{}_y_{}_z_{}_fcr_{}_*.csv'.format(
        radius, x, y, z, fcr_size))
    

    and the file opening line changes to:

    with open('../FCR_Network_Coordinates_r_{}_x_{}_y_{}_z_{}_fcr_{}_{}.csv'.format(
            radius, x, y, z, fcr_size, save_number), 'wb') as csv_file:
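
    Putting the parameterised pattern and the numbering together, a helper along these lines may be convenient. It is a sketch under a few assumptions: the function name next_output_path and the directory argument are made up here, it needs Python 3.4+ for glob.escape, and it reads the number from whatever follows the parameter prefix instead of splitting on '_', so that a name with extra text between the prefix and the number is ignored.

```python
import glob
import os
import tempfile

def next_output_path(directory, radius, x, y, z, fcr_size):
    """Return the next free numbered path for one parameter combination."""
    prefix = 'FCR_Network_Coordinates_r_{}_x_{}_y_{}_z_{}_fcr_{}_'.format(
        radius, x, y, z, fcr_size)
    # Escape the fixed part so that wildcard characters in the directory
    # name or in the parameters are matched literally
    pattern = glob.escape(os.path.join(directory, prefix)) + '*.csv'
    numbers = []
    for path in glob.glob(pattern):
        stem = os.path.splitext(os.path.basename(path))[0]
        try:
            numbers.append(int(stem[len(prefix):]))
        except ValueError:
            pass  # something other than a number follows the prefix
    save_number = max(numbers) + 1 if numbers else 1
    return os.path.join(directory, '{}{}.csv'.format(prefix, save_number))

# Demonstration in a throwaway directory with three empty files
with tempfile.TemporaryDirectory() as tmp:
    for name in ('FCR_Network_Coordinates_r_3_x_0.3_y_0.3_z_2_fcr_2_1.csv',
                 'FCR_Network_Coordinates_r_3_x_0.3_y_0.3_z_2_fcr_2_3.csv',
                 'FCR_Network_Coordinates_r_5_x_0.3_y_0.3_z_2_fcr_2_9.csv'):
        open(os.path.join(tmp, name), 'w').close()
    same_params = os.path.basename(next_output_path(tmp, 3, 0.3, 0.3, 2, 2))
    new_params = os.path.basename(next_output_path(tmp, 4, 0.3, 0.3, 2, 2))

print(same_params)  # ends in _4.csv: highest number for r=3 is 3
print(new_params)   # ends in _1.csv: no files yet for r=4
```

    The file with r_5 and number 9 does not influence either result, which is the behaviour requested in EDIT2.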