I have a "bird" folder containing 11345 bird images named 1.jpg, 2.jpg, 3.jpg, ..., 11344.jpg, 11345.jpg. I need to save these bird images as "filenames.pickle" for use in further machine learning modules. The data should be arranged like this: dataset/train/filenames.pickle, dataset/test/filenames.pickle
I need to create one single pickle file, filenames.pickle, that covers all 11345 bird images. I am confused about how to add these images to the pickle so that my code takes the pickle file but still reaches the images in the end to train the machine learning models.
from PIL import Image
import pickle
'''
I am just trying to convert one image into a pickle to get an idea.
If it is successfully converted into a pickle, then I will read all the
images inside the "bird" folder and convert all of them into one
single pickle file.
'''
# converting an image into pickle
img = Image.open('059.jpg')
with open('059.pickle', 'wb') as f:
    pickle.dump(img, f)
## read the pickle file
with open('059.pickle','rb') as f:
    file = pickle.load(f)
    print(file)
# after reading the 059.pickle file, the output is:
# <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x2115BE59190>
# I don't want this PIL image object stored in the pickle file.
# I want the pickle file to store a result like this: ['59.jpg'].
## to convert whole images inside bird folder
## input folder = bird\\images\\all_images_in_jpg_format
image = "bird\\images\\"
fout = open("bird\\filenames.pickle",'wb')
pickle.dump(image,fout)
fout.close()
with open("bird\\filenames.pickle",'rb') as f:
    file = pickle.load(f)
    print(file)
# output : bird\images\
## the above output is wrong
'''
because when I am done reading all the images and have created one
pickle file named "filenames.pickle", it should store the images like
this:
['01.jpg','0342.jpg','06762.jpg', '06752.jpg', '05122.jpg',
'05144.jpg', '06635.jpg','06638.jpg',
'05632.jpg',......'11345.jpg']
and after reading this pickle file, the model will somehow
automatically read the images via the pickle file.
'''
I am not very familiar with pickle files and their format. Can anyone please help me or suggest how I should tackle this problem? How will the model read the images via the pickle file? What does the pickle file contain (image data and pixel information, or just the names of the image files) so that the model can take the pickle file and learn from the images at training time?
This is a modification of my original answer. I now pickle the file names into one file and the images into another.
from PIL import Image
import os
import pickle
from glob import glob
## to convert whole images inside bird folder
## input folder = bird\\images\\all_images_in_jpg_format
PICKLE_FILE = "bird\\filenames.pickle"
SOURCE_DIRECTORY = "bird\\images\\"
PICKLE_IMAGES = "bird\\images.pickle"
path_list = glob(os.path.join(SOURCE_DIRECTORY, "*.jpg"))
# pickle images into big pickle file
with open(PICKLE_IMAGES,"wb") as f:
    for file_name in path_list:
        # one pickle.dump call per image, appended to the same file
        pickle.dump(Image.open(file_name),f)
# get short names from the path list
file_list = [os.path.basename(x) for x in path_list]
# pickle short name list (the with-blocks make sure the file is closed)
with open(PICKLE_FILE, 'wb') as f:
    pickle.dump(file_list, f)
# test that we can reread the list
with open(PICKLE_FILE,"rb") as f:
    recovered_list = pickle.load(f)
if file_list == recovered_list:
    print("Lists Match!")
else:
    print("Lists Don't Match!!!")
# read a couple images out of the image file:
display_count = 5
with open(PICKLE_IMAGES,"rb") as f:
    while True:
        try:
            pickle.load(f).show()
            display_count -= 1
            if display_count <= 0:
                break
        except EOFError:
            # fewer images in the file than display_count
            break
Your trainer may still expect individually pickled images, or it may not accept the image format used by PIL.
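If the trainer only wants a list of names such as ['59.jpg'], as the question suggests, then filenames.pickle holds no pixel data at all and the loader opens each image from disk by name. Here is a rough sketch of that layout. The function names, the 80/20 split, and the fixed shuffle seed are my own assumptions for illustration, not something your training code is known to require.

```python
import os
import pickle
import random


def write_filename_pickles(image_dir, dataset_dir, test_fraction=0.2, seed=0):
    """Pickle lists of image file names (not pixel data) for train and test.

    Hypothetical helper: the split ratio and seed are illustrative assumptions.
    """
    names = sorted(n for n in os.listdir(image_dir) if n.lower().endswith(".jpg"))
    random.Random(seed).shuffle(names)  # reproducible shuffle before splitting
    n_test = int(len(names) * test_fraction)
    splits = {"test": names[:n_test], "train": names[n_test:]}
    for split, split_names in splits.items():
        split_dir = os.path.join(dataset_dir, split)
        os.makedirs(split_dir, exist_ok=True)
        # writes dataset/train/filenames.pickle and dataset/test/filenames.pickle
        with open(os.path.join(split_dir, "filenames.pickle"), "wb") as f:
            pickle.dump(split_names, f)
    return splits


def load_image_paths(image_dir, dataset_dir, split):
    """Read one filenames.pickle and turn each name back into a path on disk."""
    with open(os.path.join(dataset_dir, split, "filenames.pickle"), "rb") as f:
        names = pickle.load(f)
    return [os.path.join(image_dir, name) for name in names]
```

With this layout you would call write_filename_pickles("bird\\images", "dataset") once, and at training time pass each path from load_image_paths("bird\\images", "dataset", "train") to Image.open. Whether your model code expects exactly this structure is something to check against its data loader.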