I have a huge number of images with their labels in .mat files (so I cannot use tf.data.Dataset.from_tensor_slices()), and I want to use the tf.data API to build a TensorFlow dataset out of them.
As I read in the documentation, I can use tf.data.TextLineDataset for a large amount of data (I have a txt file with the paths of all the images, and I pass the path of that txt file as the tf.data.TextLineDataset argument).
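For reference, such an address file is just plain text with one image path per line; a minimal sketch of generating it (the `images` directory name is an assumption) could look like:

```python
from pathlib import Path

# Collect all jpg paths under a (hypothetical) image directory,
# sorted so the order matches the corresponding label file.
image_paths = sorted(Path("images").glob("*.jpg"))

# Write one path per line; tf.data.TextLineDataset yields these lines.
with open("img_addresses.txt", "w") as f:
    for p in image_paths:
        f.write(str(p) + "\n")
```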
Then, I can use the map method to read each file (tf.read_file), decode the jpg image (tf.image.decode_jpeg), and do some basic transformations on the image.
However, I cannot use scipy.io.loadmat anywhere in the map function, because I have no string indicating the path to the mat file; all I have is a tf.Tensor.
I don't think that reading all the images and making a TFRecord out of them is very efficient in this case, because then I am basically doing everything twice: once reading the whole set of images and writing the TFRecord, and once again reading the TFRecord to make the TensorFlow dataset.
Any idea how I can resolve this issue?
This is my code:
dataset = tf.data.TextLineDataset(txt_file).map(read_img_and_mat)
and then:
def read_img_and_mat(path):
    image_string = tf.read_file(path)
    image_decoded = tf.image.decode_jpeg(image_string, channels=3)
    label = ...  # get label from mat file
    return image_decoded, label
I found a way to do it using tf.data.Dataset.from_generator.
The trick I found was to make two separate Datasets (one for the mat files and one for the jpg files) and then to combine them using tf.data.Dataset.zip.
Here is how it works:
def read_mat_file():
    while True:
        with open('mat_addresses.txt', 'r') as input_:
            for line in input_:
                # open the mat file and extract the label from it as an np.array
                yield tuple(label)  # why tuple? https://github.com/tensorflow/tensorflow/issues/13101

mat_dataset = tf.data.Dataset.from_generator(read_mat_file, tf.int64)
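Stripped of TensorFlow, the generator above is plain Python; a self-contained sketch (with a made-up labels file standing in for mat_addresses.txt, and a fake label in place of the scipy.io.loadmat call) shows the yield-a-tuple-per-line shape:

```python
import os
import tempfile

# Write a small stand-in for 'mat_addresses.txt' (file names are hypothetical).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("a.mat\nb.mat\n")

def read_labels(addr_file):
    with open(addr_file, "r") as input_:
        for line in input_:
            # In the real code this is where scipy.io.loadmat(line.strip())
            # would run; here we fake a label from the filename length.
            label = [len(line.strip())]
            yield tuple(label)  # from_generator expects tuples

labels = list(read_labels(path))
os.remove(path)
```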
In order to get the next batch, one just has to do:
iterator = mat_dataset.make_one_shot_iterator()
sess.run(iterator.get_next())
However, one can make img_dataset the same way and combine it with mat_dataset like this:
def read_img(path):
    image_string = tf.read_file(path)
    image_decoded = tf.image.decode_jpeg(image_string, channels=3)
    return image_decoded

img_dataset = tf.data.TextLineDataset('img_addresses.txt').map(read_img)
dataset = tf.data.Dataset.zip((mat_dataset, img_dataset))
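One caveat worth noting: tf.data.Dataset.zip pairs elements positionally, so the two address files must list corresponding images and mat files on the same line numbers. A plain-Python analogy of that pairing:

```python
# Stand-ins for the two address files' contents (names are hypothetical).
img_lines = ["cat.jpg", "dog.jpg"]
mat_lines = ["cat.mat", "dog.mat"]

# zip pairs the i-th image with the i-th label, exactly as Dataset.zip does;
# a misordered address file would silently pair images with wrong labels.
pairs = list(zip(img_lines, mat_lines))
```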
And now, the next batch can be fetched as mentioned above.
PS. I have no idea how efficient this code is in comparison to feed_dict.