I wrote the following code:
transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
train_dataset = torch.utils.data.TensorDataset(torch.from_numpy(X_train).float(), torch.from_numpy(y_train).float())
val_dataset = torch.utils.data.TensorDataset(torch.from_numpy(X_val).float(), torch.from_numpy(y_val).float())
# Define the dataloaders
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True, num_workers=4)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=32, shuffle=False, num_workers=4)
I would like to apply the composed transform to my dataset (X_train and X_val), which are both NumPy arrays. How can I apply the transform to augment and normalize my dataset? Should I apply it before model training or during model training?
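Calling transform on the whole array will not work: the pipeline starts with ToPILImage, which expects a single image, so it has to be applied one sample at a time. If you want to transform everything up front, you can do it per sample (this assumes each sample is a 3 x H x W float tensor with values in [0, 1]):
train_transformed = torch.stack([transform(x) for x in torch.from_numpy(X_train).float()])
val_transformed = torch.stack([transform(x) for x in torch.from_numpy(X_val).float()])
train_dataset = torch.utils.data.TensorDataset(train_transformed, torch.from_numpy(y_train).float())
val_dataset = torch.utils.data.TensorDataset(val_transformed, torch.from_numpy(y_val).float())
Note that this applies the transform once, before training, so every epoch sees exactly the same crops and flips. For real augmentation, apply the transform during training instead, one sample at a time inside a Dataset's __getitem__, so that each epoch draws new random crops and flips. The validation set should normally get only deterministic steps (resize, center crop, normalize) rather than the random ones.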