I have a folder containing 1300 .JPEG files, all of which have filenames that follow a specific pattern.
Each filename has the form category_count_randomString.JPEG. For example, here is one image from the folder:
13_2_5jdf.JPEG
where 13 is the category, 2 is the count of that category in the image, followed by the random string.
I'd like to be able to:
For now, I've just loaded the images (not yet as an array) using the glob function.
import glob
data = '/Users/Data'
images = glob.glob(data+'/*.JPEG')
I'm new to coding, so I'm looking for 'idiot-proof' lines of code that I can just drop into my notebook to make this work.
You can use os.listdir
to get a list of all the files in your data directory, and the string split
method to pull the information out of each filename:
import os

data_path = "/Users/Data"
categories = []
counts = []
rand_strs = []
for img_filename in os.listdir(data_path):
    # Only process the image files
    if img_filename.endswith(".JPEG"):
        # Drop the extension, then split "13_2_5jdf" into its three parts
        category, count, rand_str = img_filename.split('.')[0].split('_')
        categories.append(category)
        counts.append(int(count))  # store the count as a number
        rand_strs.append(rand_str)
The three lists share the same indices, so entry i in each list describes the same file. For example, to look up the count stored for the first file in category 13, you can do
category_idx = categories.index('13')
print("Category %s has %d elements" % (categories[category_idx], counts[category_idx]))
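Note that list.index only returns the first match, so the lookup above reports a single file. If you want the total across every file in a category, one possible approach is to add the counts up in a dictionary. This is only a sketch: the helper name total_counts_by_category and the example filenames are made up for illustration, and it assumes every .JPEG name really follows category_count_randomString.

```python
from collections import defaultdict
import os


def total_counts_by_category(filenames):
    """Sum the count field of every .JPEG file, grouped by category."""
    totals = defaultdict(int)
    for name in filenames:
        # basename lets this accept full paths (e.g. from glob) as well
        stem, ext = os.path.splitext(os.path.basename(name))
        if ext != ".JPEG":
            continue  # skip anything that is not a .JPEG file
        # maxsplit=2 keeps any underscores inside the random string intact
        category, count, _rand_str = stem.split("_", 2)
        totals[category] += int(count)
    return dict(totals)


# Hypothetical filenames, used only to illustrate the lookup
example = ["13_2_5jdf.JPEG", "13_1_k2p9.JPEG", "7_4_zz01.JPEG", "notes.txt"]
totals = total_counts_by_category(example)
print("Category 13 has %d elements in total" % totals["13"])
```

On your own data you could pass in os.listdir(data_path), or the images list you already built with glob, instead of the example list.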