python, csv, png

Extracting CSV row by row into separate PNG files in Python


Background: I am working on a neural network and I want to use the EMNIST (Extended MNIST) dataset, which is available at: https://www.kaggle.com/datasets/crawford/emnist

However, my program is built to retrieve its dataset from a specific directory layout: {program’s dir.} > {dataset name} > {train or test} > {class_label Ex: 5} > {filename}.png
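As a minimal sketch of that layout, the target path for a single image could be built with pathlib. The function name `image_path`, the dataset name "emnist", and the `{class_label}_{index}.png` naming scheme are assumptions made for illustration, not part of the original program.

```python
from pathlib import Path

def image_path(root, dataset, split, class_label, index):
    """Build {root}/{dataset}/{split}/{class_label}/{filename}.png."""
    if split not in ("train", "test"):
        raise ValueError(f"split must be 'train' or 'test', got {split!r}")
    # Hypothetical naming scheme: label plus row index keeps names unique
    filename = f"{class_label}_{index}.png"
    return Path(root) / dataset / split / str(class_label) / filename

path = image_path(".", "emnist", "train", 5, 0)
print(path.as_posix())
```

Calling `path.parent.mkdir(parents=True, exist_ok=True)` before saving would create any missing folders along the way.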

The Problem: The EMNIST dataset comes in CSV format. Each file contains the following:

  • Each row is a separate image
  • 785 columns
  • First column = class_label
  • Each subsequent column holds one pixel value (28 x 28 pixels, so 784 columns)
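To make the row format concrete, here is a small sketch that splits one row into its label and a 28 x 28 grid using only the standard library. The in-memory sample data and the helper name `parse_row` are made up for illustration, and the row-by-row pixel ordering is an assumption that should be checked against the real files.

```python
import csv
import io

# Hypothetical two-row sample in the described format: label, then 784 pixels
sample = io.StringIO(
    "5," + ",".join(["0"] * 784) + "\n"
    "12," + ",".join(["255"] * 784) + "\n"
)

def parse_row(row):
    """Split one CSV row into (class_label, 28 rows of 28 pixel ints)."""
    if len(row) != 785:
        raise ValueError(f"expected 785 columns, got {len(row)}")
    label = row[0]
    # csv.reader yields strings, so convert each pixel to an integer
    pixels = [int(v) for v in row[1:]]
    # Regroup the flat list of 784 values into 28 rows of 28
    grid = [pixels[r * 28:(r + 1) * 28] for r in range(28)]
    return label, grid

parsed = [parse_row(row) for row in csv.reader(sample)]
for label, grid in parsed:
    print(label, len(grid), len(grid[0]))
```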

I want to turn every single row into a PNG file in its own class_label folder, so that all images with the same class_label end up in the same folder.

The problem is that I have no idea how to do this or where to begin, since I have never worked with CSV files.

So I am looking for somebody willing to help me do this in Python so I can continue working on my project!

I have been looking around the internet for a way to do it row by row, but I have yet to find a good solution.


Solution

  • You can use PIL (Pillow) to convert each row of numerical values into an image. I hope the code below helps:

    import csv
    import os
    from PIL import Image

    # Open the CSV file and read the rows one at a time
    with open('emnist.csv', 'r', newline='') as f:
        reader = csv.reader(f)

        # enumerate gives every row a unique index for its filename
        for i, row in enumerate(reader):
            class_label = row[0]  # class label
            # csv yields strings, so convert the pixels to integers
            pixel_values = [int(v) for v in row[1:]]

            # Create the class folder if it does not exist yet
            os.makedirs(class_label, exist_ok=True)

            # Create a 28x28 grayscale image using the pixel values
            img = Image.new('L', (28, 28))
            img.putdata(pixel_values)

            # Save the image to its class folder
            img.save(f'{class_label}/{class_label}_{i}.png')