
How To Import The MNIST Dataset From Local Directory Using PyTorch


I am working on the well-known MNIST handwritten-digit problem in PyTorch. I downloaded the training and test sets, including the labels, from the main website. The files are named like t10k-images-idx3-ubyte.gz, which extracts to t10k-images-idx3-ubyte. My dataset folder looks like this:

MNIST
 Data
  train-images-idx3-ubyte.gz
  train-labels-idx1-ubyte.gz
  t10k-images-idx3-ubyte.gz
  t10k-labels-idx1-ubyte.gz

I wrote the following code to load the data:

import torch
import torchvision


def load_dataset():
    data_path = "/home/MNIST/Data/"
    xy_trainPT = torchvision.datasets.ImageFolder(
        root=data_path, transform=torchvision.transforms.ToTensor()
    )
    train_loader = torch.utils.data.DataLoader(
        xy_trainPT, batch_size=64, num_workers=0, shuffle=True
    )
    return train_loader

My code fails with an error message that says: Supported extensions are: .jpg,.jpeg,.png,.ppm,.bmp,.pgm,.tif,.tiff,.webp

How can I solve this problem? I would also like to check that my images are loaded, for example with a figure showing the first 5 images from the dataset.


Solution

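  • torchvision.datasets.ImageFolder expects ordinary image files (.jpg, .png, and so on) arranged in one subfolder per class. The MNIST download is a set of gzip-compressed IDX binary files, so ImageFolder finds nothing it can read. To parse those files yourself, see the question "Extract images from .idx3-ubyte file or GZIP via Python".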
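The IDX files from the question can also be parsed with the standard library alone. The following is a minimal sketch, not the method from the linked question: `read_idx` is a hypothetical helper name, and it assumes the files store unsigned bytes (type code 0x08), as the MNIST image and label files do.

```python
import gzip
import struct


def read_idx(path):
    """Parse an unsigned-byte IDX file, gzip-compressed or not.

    Returns (dims, data): dims is a tuple of dimension sizes and data is a
    bytes object holding the values in row-major order.
    """
    # Pick the opener from the file name; .gz files are decompressed on the fly.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        # Magic number: two zero bytes, a type code, then the dimension count.
        zeros, type_code, ndim = struct.unpack(">HBB", f.read(4))
        if zeros != 0 or type_code != 0x08:
            raise ValueError("expected an IDX file of unsigned bytes")
        # Each dimension size is a 4-byte big-endian unsigned integer.
        dims = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = f.read()

    # Check that the payload length matches the sizes declared in the header.
    expected = 1
    for size in dims:
        expected *= size
    if len(data) != expected:
        raise ValueError("payload length does not match the header")
    return dims, data
```

For the layout in the question, `read_idx("/home/MNIST/Data/train-images-idx3-ubyte.gz")` should report dims of (60000, 28, 28) for the training images. If NumPy is available, `numpy.frombuffer(data, dtype=numpy.uint8).reshape(dims)` turns the bytes into an array.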

    Update

    You can load the data like this:

    xy_trainPT = torchvision.datasets.MNIST(
        root="~/Handwritten_Deep_L/",
        train=True,
        download=True,
        transform=torchvision.transforms.Compose([torchvision.transforms.ToTensor()]),
    )
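The question also asks how to check that the images are loaded. One possible sketch, assuming matplotlib is installed and that `xy_trainPT` was created as above with the `ToTensor` transform; `take_samples` and `show_samples` are hypothetical helper names, not part of torchvision.

```python
def take_samples(dataset, n=5):
    """Return the first n (image, label) pairs of an indexable dataset."""
    return [dataset[i] for i in range(min(n, len(dataset)))]


def show_samples(samples):
    """Draw the given (image, label) pairs side by side in one figure."""
    if not samples:
        return
    # matplotlib is imported here so that take_samples works without it.
    import matplotlib.pyplot as plt

    # squeeze=False keeps axes two-dimensional even for a single image.
    fig, axes = plt.subplots(1, len(samples), squeeze=False)
    for ax, (image, label) in zip(axes[0], samples):
        # ToTensor yields a 1x28x28 tensor; drop the channel axis for imshow.
        ax.imshow(image.squeeze().numpy(), cmap="gray")
        ax.set_title(str(label))
        ax.axis("off")
    plt.show()
```

Calling `show_samples(take_samples(xy_trainPT))` should then open a figure with the first five digits, each titled with its label.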
    

    With download=True, the code first checks whether the root directory (the path you pass) already contains the dataset in the layout torchvision expects.

    If it does not, the dataset is downloaded from the web.

    If it does, the existing files are used and nothing is downloaded.

    You can verify this: first pass a path that contains no dataset (the data will be downloaded), then pass a path that already contains it (no download will happen).
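To reuse the .gz files that were already downloaded by hand, they can be copied to the place where torchvision looks for them. The sketch below rests on an assumption: recent torchvision versions keep the MNIST archives in `<root>/MNIST/raw`, and older versions may use a different layout. `stage_mnist_archives` and `MNIST_ARCHIVES` are hypothetical names introduced for illustration.

```python
import shutil
from pathlib import Path

# File names taken from the folder listing in the question.
MNIST_ARCHIVES = (
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
)


def stage_mnist_archives(source_dir, root):
    """Copy local MNIST .gz files into <root>/MNIST/raw and return that folder.

    The MNIST/raw layout is an assumption about recent torchvision versions.
    """
    raw_dir = Path(root).expanduser() / "MNIST" / "raw"
    raw_dir.mkdir(parents=True, exist_ok=True)
    for name in MNIST_ARCHIVES:
        source = Path(source_dir) / name
        if not source.is_file():
            raise FileNotFoundError(f"missing archive: {source}")
        # copy2 keeps the original files in place and preserves timestamps.
        shutil.copy2(source, raw_dir / name)
    return raw_dir
```

After `stage_mnist_archives("/home/MNIST/Data/", "~/Handwritten_Deep_L/")`, creating `torchvision.datasets.MNIST` with the same root and download=True should, under that assumption, extract the staged archives instead of fetching them, provided the files pass torchvision's integrity check.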