python, deep-learning, memory-management, pytorch

How to solve 'OutOfMemoryError: CUDA out of memory' in PyTorch?


In Colab I am predicting 2448x2448 arrays with 7 classes using a trained model (input = (2448, 2448, 3), output = (2448, 2448, 7)).

for idx in range(len(test_dataset)):

    image, gt_mask = test_dataset[idx]
    image_vis = test_dataset_vis[idx][0].astype('uint8')
    x_tensor = torch.from_numpy(image).to(DEVICE).unsqueeze(0)
    # Predict test image
    pred_mask = best_model(x_tensor)
    pred_mask = pred_mask.detach().squeeze().cpu().numpy()
    # Convert pred_mask from `CHW` format to `HWC` format
    pred_mask = np.transpose(pred_mask, (1, 2, 0))
    # Get prediction channel corresponding to foreground
    pred_urban_land_heatmap = pred_mask[:, :, select_classes.index('urban_land')]
    pred_mask = colour_code_segmentation(reverse_one_hot(pred_mask), select_class_rgb_values)
    # Convert gt_mask from `CHW` format to `HWC` format
    gt_mask = np.transpose(gt_mask, (1, 2, 0))
    gt_mask = colour_code_segmentation(reverse_one_hot(gt_mask), select_class_rgb_values)
    cv2.imwrite(os.path.join(sample_preds_folder, f"sample_pred_{idx}.png"),
                np.hstack([image_vis, gt_mask, pred_mask])[:, :, ::-1])

    visualize(
        original_image = image_vis,
        ground_truth_mask = gt_mask,
        predicted_mask = pred_mask,
        pred_urban_land_heatmap = pred_urban_land_heatmap
    )

But I get

OutOfMemoryError: CUDA out of memory. Tried to allocate 366.00 MiB. GPU 0 has a total capacity of 14.75 GiB of which 357.06 MiB is free. Process 224843 has 14.40 GiB memory in use. Of the allocated memory 13.94 GiB is allocated by PyTorch, and 344.24 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

I have a 15 GB GPU in Colab, but when I reach this line

pred_mask = best_model(x_tensor)

the allocated GPU memory spikes to the top of the allocation graph.


Solution

  • Your model or your input is too big, so you do not have much choice: use a smaller model or smaller inputs. A 2448x2448x3 array is very large for most networks. Image models typically take inputs like 224x224 or 512x512, so you need to resize the image or tile it.
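A minimal sketch of the tiling approach, shown with NumPy so the stitching logic is self-contained: `predict_fn` is a stand-in for the real model call (in practice `best_model(x_tensor)` wrapped in `torch.no_grad()` so no autograd buffers are kept on the GPU). The 512 tile size and reflect padding are illustrative choices, not requirements:

```python
import numpy as np

def predict_tiled(image, predict_fn, tile=512, n_classes=7):
    """Split a CHW image into tiles, predict each tile, stitch the results.

    `predict_fn` maps a (C, tile, tile) array to an (n_classes, tile, tile)
    array -- in PyTorch this would be the model forward pass under
    `torch.no_grad()`, one tile at a time.
    """
    c, h, w = image.shape
    # Pad so height and width are exact multiples of the tile size
    ph = (tile - h % tile) % tile
    pw = (tile - w % tile) % tile
    padded = np.pad(image, ((0, 0), (0, ph), (0, pw)), mode="reflect")
    out = np.zeros((n_classes, h + ph, w + pw), dtype=np.float32)
    # Predict tile by tile; peak memory is bounded by the tile size,
    # not by the full 2448x2448 image
    for y in range(0, h + ph, tile):
        for x in range(0, w + pw, tile):
            out[:, y:y + tile, x:x + tile] = predict_fn(
                padded[:, y:y + tile, x:x + tile])
    return out[:, :h, :w]  # crop the padding back off
```

Each forward pass then only ever sees a tile-sized input, so peak GPU memory is bounded by the tile size rather than the full image. Separately, the `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` hint in the traceback only helps when the failure is due to fragmentation (a large reserved-but-unallocated amount); it cannot help when the model genuinely needs more memory than the GPU has.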