python pytorch training-data dataloader

How to use a PyTorch DataLoader in the IPython console of Spyder


I followed this tutorial but can't figure out how to actually use my DataLoader to train an ANN. When I iterate over my DataLoader, a command prompt window pops up and immediately closes again, and afterwards nothing happens. My original data (X and labels) are both NumPy arrays.


import torch
from torch.utils import data
import numpy as np

class Dataset(data.Dataset):
    'Characterizes a dataset for PyTorch'
    def __init__(self, datax, labels):
        'Initialization'
        self.labels = torch.tensor(labels)
        self.datax = torch.tensor(datax)
        self.len = len(datax)

    def __len__(self):
        'Denotes the total number of samples'
        return self.len

    def __getitem__(self, index):
        'Generates one sample of data'
        # Load data and get label
        X = self.datax[index]
        y = self.labels[index]
        return X, y

params = {'batch_size': 64,
          'shuffle': True,
          'num_workers': 1}
training_set = Dataset(datax=X, labels=labels)
training_generator = data.DataLoader(training_set, **params)

for x in training_generator:
    print(1)

I tried many times and caught a glimpse of the command prompt, which says something like:

OMP: Info #212: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #210: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: 0-3
OMP: Info #156: KMP_AFFINITY: 4 available OS procs
OMP: Info #157: KMP_AFFINITY: Uniform topology
OMP: Info #179: KMP_AFFINITY: 1 packages x 2 cores/pkg x 2 threads/core (2 total cores)
OMP: Info #214: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #171: KMP_AFFINITY: OS proc 0 maps to package 0 core 0 thread 0 
OMP: Info #171: KMP_AFFINITY: OS proc 1 maps to package 0 core 0 thread 1 
OMP: Info #171: KMP_AFFINITY: OS proc 2 maps to package 0 core 1 thread 0 
OMP: Info #171: KMP_AFFINITY: OS proc 3 maps to package 0 core 1 thread 1 
OMP: Info #250: KMP_AFFINITY: pid 10264 tid 2388 thread 0 bound to OS proc set 0
OMP: Info #250: KMP_AFFINITY: pid 10264 tid 3288 thread 1 bound to OS proc set 2

Solution

  • Here is how I do it:

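    from torch.utils.data import Dataset, DataLoader

    class myDataset(Dataset):
        '''
        A dataset for PyTorch that wraps the arrays X and y.
        '''
        def __init__(self, X, y):
            self.X = X
            self.y = y
        def __getitem__(self, index):
            # Return one (sample, label) pair
            return self.X[index], self.y[index]
        def __len__(self):
            return len(self.X)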
    

    Then you can simply pass it to the loader:

    batch_size = 64  # same value as in the question
    full_dataset = myDataset(X, y)
    train_loader = DataLoader(full_dataset, batch_size=batch_size)
    

    Here, X and y are plain NumPy arrays; the DataLoader's default collate function converts each batch to tensors.

    For training, you can iterate over the loader with a for loop:

    for data, target in train_loader:
        if train_on_gpu:
            data, target = data.double().cuda(), target.double().cuda()
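
To tie the pieces together, here is a minimal end-to-end sketch of the same pattern. It is not taken from the answer above: the names ArrayDataset and train, the tiny network, the SGD settings, and the synthetic data are all assumptions made for illustration. It keeps num_workers at 0 (the DataLoader default, which the loader above also relies on), so batches are loaded in the main process and no worker process is started. That may be what avoids the flashing command prompt from the question, though I have not verified this in Spyder.

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader


class ArrayDataset(Dataset):
    """Hypothetical name: wraps two NumPy arrays of equal length."""

    def __init__(self, X, y):
        # Convert once, up front; float32 matches nn.Linear's default weights.
        self.X = torch.as_tensor(X, dtype=torch.float32)
        self.y = torch.as_tensor(y, dtype=torch.float32)

    def __len__(self):
        return len(self.X)

    def __getitem__(self, index):
        return self.X[index], self.y[index]


def train(X, y, epochs=2, batch_size=64):
    """Train a tiny regression network; return the mean loss of each epoch."""
    # num_workers=0 loads batches in the main process, so no worker
    # process is spawned from the interactive console.
    loader = DataLoader(ArrayDataset(X, y), batch_size=batch_size,
                        shuffle=True, num_workers=0)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = nn.Sequential(nn.Linear(X.shape[1], 16), nn.ReLU(),
                          nn.Linear(16, 1)).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()

    epoch_losses = []
    for _ in range(epochs):
        total = 0.0
        for data, target in loader:
            data, target = data.to(device), target.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(data).squeeze(1), target)
            loss.backward()
            optimizer.step()
            total += loss.item() * len(data)  # weight by batch size
        epoch_losses.append(total / len(loader.dataset))
    return epoch_losses


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))        # synthetic features (assumption)
    y = X @ np.array([1.0, -2.0, 0.5])   # synthetic targets (assumption)
    print(train(X, y))
```

If you do want num_workers greater than 0 on Windows, the PyTorch documentation recommends putting the code that iterates over the loader under an `if __name__ == "__main__":` guard, as at the bottom of this sketch, and running it as a script; whether that is enough inside an interactive console depends on the setup.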