python json dataset conv-neural-network image-classification

Is there any way to match a JSON file with the dataset (images) in Python?

I'm working on a machine learning project (image classification) and I found a data set that has two parts:

The images (20,000 of them), numbered from 1 to 20,000 and not sorted into classes.
A JSON file that holds the information and classification for each image (12 classes of images). The JSON file is structured as follows:

{
  "<image_number>": {
    "image_filepath": "images/<image_number>.jpg", 
    "anomaly_class": "<class_name>"
  },
  ...
}

So I'm trying to read the JSON file and split the data set so I can deal with each class individually, then take 80% of each class as the training set and 20% as the testing set.

I tried to find a way to match the JSON file with the images so that I can sort the images into one folder per class and then divide them into training and testing sets.

Can anyone help me with that?

Thank you.

Solution

Something like the following would create folders for each of the classes and then move the images into them.

import json
import os

# Open the JSON file containing the classifications
# (the filename is a placeholder; use the name of your own JSON file)
with open("classification.json", "r") as f:
    classification = json.load(f)

# Collect the distinct class names
classes = {entry["anomaly_class"] for entry in classification.values()}

# Make one folder per class; exist_ok avoids an error if the folder already exists
for c in classes:
    os.makedirs(c, exist_ok=True)

# Move each image listed in the JSON into the folder named after its class
for image_number, image_data in classification.items():
    destination = os.path.join(image_data["anomaly_class"], "{}.jpg".format(image_number))
    os.rename(image_data["image_filepath"], destination)
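
The question also asks for an 80/20 split within each class, which the code above does not cover. One possible way to do that, as a minimal sketch using only the standard library: group the file paths by class, shuffle each group, and cut it at 80%. The function name split_by_class, the train_fraction and seed parameters, and the small example dictionary are all made up for illustration; only the "image_filepath" and "anomaly_class" keys come from the JSON structure in the question.

```python
import random
from collections import defaultdict


def split_by_class(classification, train_fraction=0.8, seed=0):
    """Return {class_name: (train_paths, test_paths)}, split separately per class.

    Hypothetical helper: the name and parameters are not from the original post.
    """
    # Group the image paths by their class
    by_class = defaultdict(list)
    for image_data in classification.values():
        by_class[image_data["anomaly_class"]].append(image_data["image_filepath"])

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    splits = {}
    for class_name, paths in by_class.items():
        paths = sorted(paths)  # stable starting order before shuffling
        rng.shuffle(paths)
        cut = int(len(paths) * train_fraction)
        splits[class_name] = (paths[:cut], paths[cut:])
    return splits


# Made-up data in the same shape as the JSON file: 10 images in one class, 5 in another
example = {
    str(n): {
        "image_filepath": "images/{}.jpg".format(n),
        "anomaly_class": "class_a" if n <= 10 else "class_b",
    }
    for n in range(1, 16)
}

for class_name, (train, test) in sorted(split_by_class(example).items()):
    print(class_name, len(train), len(test))
```

With the real data you would pass the dictionary loaded by json.load instead of the example. Note that the paths returned are the ones stored in the JSON file, so if the images have already been moved into class folders, the paths need to be rebuilt from the class name and image number. For very small classes the integer cut can leave the test set empty, so it may be worth checking the class sizes first.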