I went through this tutorial and can't figure out how to actually use my DataLoader to train an ANN. When I iterate over my DataLoader, a command prompt window pops up and immediately closes, and afterwards nothing happens. My original data (X and labels) are both np.arrays.
import torch
from torch.utils import data
import numpy as np

class Dataset(data.Dataset):
    'Characterizes a dataset for PyTorch'
    def __init__(self, datax, labels):
        'Initialization'
        self.labels = torch.tensor(labels)
        self.datax = torch.tensor(datax)
        self.len = len(datax)

    def __len__(self):
        'Denotes the total number of samples'
        return self.len

    def __getitem__(self, index):
        'Generates one sample of data'
        # Load data and get label
        X = self.datax[index]
        y = self.labels[index]
        return X, y

params = {'batch_size': 64,
          'shuffle': True,
          'num_workers': 1}
training_set = Dataset(datax=X, labels=labels)
training_generator = data.DataLoader(training_set, **params)
for x in training_generator:
    print(1)
I tried many times and caught a glimpse of the command prompt, which says something like:
OMP: Info #212: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #210: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: 0-3
OMP: Info #156: KMP_AFFINITY: 4 available OS procs
OMP: Info #157: KMP_AFFINITY: Uniform topology
OMP: Info #179: KMP_AFFINITY: 1 packages x 2 cores/pkg x 2 threads/core (2 total cores)
OMP: Info #214: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #171: KMP_AFFINITY: OS proc 0 maps to package 0 core 0 thread 0
OMP: Info #171: KMP_AFFINITY: OS proc 1 maps to package 0 core 0 thread 1
OMP: Info #171: KMP_AFFINITY: OS proc 2 maps to package 0 core 1 thread 0
OMP: Info #171: KMP_AFFINITY: OS proc 3 maps to package 0 core 1 thread 1
OMP: Info #250: KMP_AFFINITY: pid 10264 tid 2388 thread 0 bound to OS proc set 0
OMP: Info #250: KMP_AFFINITY: pid 10264 tid 3288 thread 1 bound to OS proc set 2
Here is how I do that:
from torch.utils.data import Dataset, DataLoader

class myDataset(Dataset):
    '''
    a dataset for PyTorch
    '''
    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __getitem__(self, index):
        # Return one (sample, label) pair
        return self.X[index], self.y[index]

    def __len__(self):
        return len(self.X)
Then you can simply pass it to the loader:
full_dataset = myDataset(X,y)
train_loader = DataLoader(full_dataset, batch_size=batch_size)
Note that X and y here are just NumPy arrays.
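As a quick sanity check that plain NumPy arrays are enough, here is a minimal sketch showing that the loader's default collation turns NumPy samples into batched tensors. The data is random stand-in data, not the asker's; `ArrayDataset` is only a hypothetical copy of the `myDataset` idea so the snippet runs on its own, and the array shapes and `batch_size=4` are arbitrary choices.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class ArrayDataset(Dataset):
    """Hypothetical stand-in for myDataset: indexes two NumPy arrays in parallel."""

    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __getitem__(self, index):
        return self.X[index], self.y[index]

    def __len__(self):
        return len(self.X)


# Assumed example data: 10 samples with 3 features each, one scalar label per sample.
X = np.random.rand(10, 3)
y = np.random.rand(10)

loader = DataLoader(ArrayDataset(X, y), batch_size=4)

# Pull the first batch; the default collate function stacks the samples into tensors.
first_data, first_target = next(iter(loader))

print(first_data.dtype, first_data.shape)  # torch.float64 torch.Size([4, 3])
print(first_target.shape)                  # torch.Size([4])
print(len(loader))                         # 3 batches: 4 + 4 + 2 samples
```

The batches keep NumPy's default float64 dtype, which is worth knowing before feeding them to a model whose parameters are float32.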
For training, you can then access your data with a for loop:
for data, target in train_loader:
    if train_on_gpu:
        data, target = data.double().cuda(), target.double().cuda()
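To show where that loop fits, here is a rough sketch of a complete training run built around it. Everything beyond the dataset class and the loop itself is an assumption for illustration: the random regression data, the `nn.Linear` placeholder model, the MSE loss, the SGD optimizer and its learning rate, and the `train` function name. The model is converted with `.double()` only so that it matches the `.double()` batches used in the loop above.

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader


class myDataset(Dataset):
    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __getitem__(self, index):
        return self.X[index], self.y[index]

    def __len__(self):
        return len(self.X)


def train(num_epochs=2, batch_size=16, num_workers=0):
    # Hypothetical regression data: 64 samples, 3 features, 1 target each.
    rng = np.random.default_rng(0)
    X = rng.random((64, 3))
    y = rng.random((64, 1))

    train_loader = DataLoader(myDataset(X, y), batch_size=batch_size,
                              shuffle=True, num_workers=num_workers)

    train_on_gpu = torch.cuda.is_available()
    # Placeholder model; .double() matches the float64 batches used below.
    model = nn.Linear(3, 1).double()
    if train_on_gpu:
        model = model.cuda()

    criterion = nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    epoch_losses = []
    for epoch in range(num_epochs):
        running = 0.0
        for data, target in train_loader:
            data, target = data.double(), target.double()
            if train_on_gpu:
                data, target = data.cuda(), target.cuda()
            optimizer.zero_grad()
            loss = criterion(model(data), target)
            loss.backward()
            optimizer.step()
            # Weight the batch loss by batch size to get a per-sample average later.
            running += loss.item() * data.size(0)
        epoch_losses.append(running / len(train_loader.dataset))
    return epoch_losses


if __name__ == "__main__":
    print(train())
```

The `if __name__ == "__main__":` guard is there on purpose. With `num_workers` greater than 0, the DataLoader starts worker processes, and on Windows each worker re-imports the script, so any loader code at module level runs again in every worker. That may be related to the console window that flashes up in the question. Leaving `num_workers` at its default of 0, as the loader above does, loads the data in the main process and avoids the issue entirely.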