Tags: python, pytorch, google-colaboratory, dataloader, pytorch-geometric

torch-geometric dataloader "RuntimeError: DataLoader worker (pid(s) 5190) exited unexpectedly"


I am trying to train a GNN with torch-geometric on Google Colab. This involves iterating through the batches:

import torch_geometric as pyg

# OneStepDataset, data_path, and params are defined earlier in the notebook.
train_dataset = OneStepDataset(data_path, "train", noise_std=params["noise"])
train_loader = pyg.loader.DataLoader(train_dataset, batch_size=params["batch_size"], shuffle=True, pin_memory=True, num_workers=2)
for batch in train_loader:
    print(batch)

(OneStepDataset is a subclass of pyg.data.Dataset that returns graph data as pyg.data.Data objects.)

I get this error:

RuntimeError: DataLoader worker (pid(s) 5190) exited unexpectedly

The common solutions I found are to decrease the batch size or to set num_workers=1, but they either didn't work or were only an option with the original PyTorch DataLoader. Are there any solutions to this?


Solution

  • The problem was solved by changing the runtime accelerator to a TPU, but I had to modify the PyTorch code to be TPU-compatible using the PyTorch/XLA package. I basically followed the example here: https://colab.research.google.com/github/pytorch/xla/blob/master/contrib/colab/getting-started.ipynb
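
For readers wondering what "TPU compatible" roughly involves, here is a minimal sketch of the usual single-device PyTorch/XLA changes. It is not the exact code from my notebook. The helper names (xla_available, get_device, train_one_epoch), the loss_fn argument, and the batch.y target attribute are all assumptions for illustration, and the torch_xla calls may differ between versions, so check the linked notebook for your setup.

```python
import importlib.util


def xla_available():
    """Return True if the torch_xla package can be imported in this runtime."""
    return importlib.util.find_spec("torch_xla") is not None


def get_device():
    """Pick the XLA (TPU) device when torch_xla is installed, else fall back to CPU."""
    if xla_available():
        import torch_xla.core.xla_model as xm
        return xm.xla_device()
    import torch
    return torch.device("cpu")


def train_one_epoch(model, loader, optimizer, loss_fn, device):
    """One pass over the loader on an XLA device (hypothetical helper).

    Assumes each batch is a pyg.data.Data-like object with a `.y` target
    and that `model(batch)` returns predictions; adapt both to your dataset.
    """
    import torch_xla.core.xla_model as xm

    model.train()
    for batch in loader:
        batch = batch.to(device)          # move the graph batch to the TPU
        optimizer.zero_grad()
        loss = loss_fn(model(batch), batch.y)
        loss.backward()
        # Replaces optimizer.step(); barrier=True forces the XLA graph to
        # execute when no parallel loader is wrapping the DataLoader.
        xm.optimizer_step(optimizer, barrier=True)
```

In use, the model is moved once with model.to(get_device()) and the same device is passed to train_one_epoch; the rest of the training script can stay as it was.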