python tensorflow machine-learning artificial-intelligence

loading data into X_train and Y_train

My data is organised as shown below. How would I load it into x_train and y_train to build a Keras model?

train.zip

The image files for the training set

train.txt

The labels for the training set

test.zip

The image files for the test set

This is what train.txt looks like:

This is what the zip files look like:

I am confused about how to load this data into NumPy arrays for x_train, y_train, x_test and y_test so that I can build a CNN model. I've tried many things but had no luck.

Solution

You can use Keras's ImageDataGenerator. First unzip your files and add a column header line to the txt file, for example:

Filename Label
train/0.jpg 5
train/1.jpg 21
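
If you would rather script the unzip-and-add-header step than do it by hand, something like the following might work. It is only a sketch using the standard library: the helper names extract_images and add_header are made up for illustration, and it assumes the label file is small enough to read into memory and that "Filename Label" is the header you want.

```python
import zipfile
from pathlib import Path


def extract_images(zip_path, out_dir):
    """Extract every member of zip_path into out_dir and return the member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        return archive.namelist()


def add_header(labels_path, header="Filename Label"):
    """Prepend a header line to the label file unless it is already the first line."""
    path = Path(labels_path)
    text = path.read_text()
    first_line = text.splitlines()[0] if text else ""
    if first_line.strip() != header:
        path.write_text(header + "\n" + text)


# Hypothetical usage (adjust the paths to wherever your files actually live):
# extract_images("train.zip", "uos-com2028")
# add_header("uos-com2028/train/train.txt")
```

add_header is safe to call twice, since it leaves the file alone when the header is already present.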

Now you can read the txt file with pandas and pass the resulting DataFrame to the ImageDataGenerator:

import pandas as pd
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Read the whitespace-separated label file; the header row supplies the column names
df = pd.read_csv("uos-com2028/train/train.txt", sep=r"\s+")
# Column(s) holding the targets
columns = ["Label"]
# Rescale pixel values from [0, 255] to [0, 1]
datagen = ImageDataGenerator(rescale=1.0 / 255.0)
# Change color_mode, batch_size, and target_size to suit your images
traindata = datagen.flow_from_dataframe(
    dataframe=df,
    directory="uos-com2028/train",
    x_col="Filename",
    y_col=columns,
    color_mode="rgb",
    batch_size=16,
    class_mode="raw",  # yield the label values unchanged as a NumPy array
    target_size=(256, 256),
    shuffle=True,
)

You can then pass the traindata object as the training input to model.fit().
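
To make the last step concrete, here is a rough sketch of a model that could consume such a generator. It rests on assumptions not stated in the question: that the labels are integer class ids starting at 0, and that the images are 256x256 RGB. The build_cnn helper and its layer sizes are made up for illustration and are not tuned.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_cnn(num_classes, input_shape=(256, 256, 3)):
    """Small illustrative CNN; the layer sizes are arbitrary, not tuned."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.GlobalAveragePooling2D(),
        layers.Dense(num_classes, activation="softmax"),
    ])
    # Sparse loss: targets are plain integer ids, as class_mode="raw" yields them
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Hypothetical usage with the DataFrame and generator from the answer:
# model = build_cnn(num_classes=int(df["Label"].max()) + 1)
# model.fit(traindata, epochs=5)
```

If your labels are continuous values rather than class ids, you would instead want a single linear output unit and a regression loss such as mean squared error.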