I'm working on a machine learning project (image classification) and I found a data set that has two parts, the images and a JSON file structured like this:
{
    "<image_number>": {
        "image_filepath": "images/<image_number>.jpg",
        "anomaly_class": "<class_name>"
    },
    ...
}
So I'm trying to read the JSON file and split the data set so that I can deal with each class individually, then take 80% of each class as the training set and the remaining 20% as the testing set.
I tried to find a way to match the JSON file with the images so that I can sort the images into one folder per class and then divide them into training and testing sets.
Can anyone help me with that?
Thank you.
Something like the following would create folders for each of the classes and then move the images into them.
import json
import os
from os import path

# Open the JSON file containing the classifications
with open("clasification.json", "r") as f:
    classification = json.load(f)

# Create a set which contains all the classes
classes = {i["anomaly_class"] for i in classification.values()}

# For each of the classes, make a folder to contain its images
# (exist_ok=True avoids an error if the folder is already there)
for c in classes:
    os.makedirs(c, exist_ok=True)

# For each image entry in the JSON, move the image into the folder named after its class
for image_number, image_data in classification.items():
    os.rename(image_data["image_filepath"], path.join(image_data["anomaly_class"], "{}.jpg".format(image_number)))
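
For the 80%/20% split per class, something along these lines might work. This is only a sketch: the function name `split_by_class`, the `train_fraction` and `seed` parameters, and the small made-up `example` dictionary are my own assumptions, not part of your data set. It groups the image numbers by `anomaly_class`, shuffles each group, and cuts it at 80%.

```python
import random
from collections import defaultdict


def split_by_class(classification, train_fraction=0.8, seed=0):
    """Return (train, test) dicts mapping class name -> list of image numbers."""
    # Group the image numbers by their class
    by_class = defaultdict(list)
    for image_number, image_data in classification.items():
        by_class[image_data["anomaly_class"]].append(image_number)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = {}, {}
    for class_name, image_numbers in by_class.items():
        image_numbers = sorted(image_numbers)
        rng.shuffle(image_numbers)
        # Number of training images for this class, rounded down
        cut = int(len(image_numbers) * train_fraction)
        train[class_name] = image_numbers[:cut]
        test[class_name] = image_numbers[cut:]
    return train, test


# Made-up example data (hypothetical): 5 images of class "a", 10 of class "b"
example = {
    str(n): {
        "image_filepath": "images/{}.jpg".format(n),
        "anomaly_class": "a" if n < 5 else "b",
    }
    for n in range(15)
}
train, test = split_by_class(example)
print({c: len(v) for c, v in train.items()})
print({c: len(v) for c, v in test.items()})
```

With your real data you would pass the dictionary loaded from the JSON file instead of `example`. The two returned dictionaries can then be used to move the files into, say, train/<class_name> and test/<class_name> folders with `os.makedirs` and `os.rename`, the same way as in the code above.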