I have a question about memory usage in my program. When I run the file-reading program below, the file is only 100 MB, but my process's memory usage shows as 1.6 GB (I haven't set any additional variables and have only imported the necessary libraries). I understand that many other things are also kept in memory while a program runs, but is there any way to reduce this? The same thing happens when I move the variables to the GPU: the GPU then shows 600 MB in use.
```python
import numpy as np
import torch
import time
import struct

if __name__ == "__main__":
    graphEdge = []
    boundList = []
    file_path = "./../../srcList.bin"
    with open(file_path, 'rb') as file:
        while True:
            data = file.read(4)
            if not data:
                break
            integer = struct.unpack('i', data)[0]
            graphEdge.append(integer)
    file_path = "./../../range.bin"
    with open(file_path, 'rb') as file:
        while True:
            data = file.read(4)
            if not data:
                break
            integer = struct.unpack('i', data)[0]
            boundList.append(integer)
    graphEdge = torch.Tensor(graphEdge).to(torch.int).to('cuda:0')
    boundList = torch.Tensor(boundList).to(torch.int).to('cuda:0')
    memory_size = graphEdge.element_size() * graphEdge.numel()
    print(f"Tensor memory: {memory_size/(1024*1024)} MB")
```
A 32-bit number stored as a Python `int` object takes 28 bytes (per `sys.getsizeof`), which CPython's allocator rounds up to 32. Add 8 bytes for the list's reference to it, and each 4 bytes in the file costs about 40 bytes in memory as a Python list of ints.
You commented that `graphEdge` has 29,856,864 numbers, but only 589,563 different ones. So you have many duplicates, and you can save a lot of memory by not storing separate `int` objects with the same value. Use only one object per distinct value, and reuse that object. One way to do that is with a dictionary that maps each value to its object. Do `intern = {}.setdefault` at the start, and then append like this:

    graphEdge.append(intern(integer, integer))
The dictionary of course also takes extra memory, but it saves a lot more than it costs. I estimate it'll overall take 800 MB less.
See interning at Wikipedia.
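Put together, the reading loop with interning looks like this (a sketch: `io.BytesIO` with made-up values stands in for the real file):

```python
import io
import struct

# Made-up sample data standing in for srcList.bin: note the duplicates.
raw = struct.pack('5i', 1000000, 1000000, 500000, 1000000, 500000)

intern = {}.setdefault   # returns the stored object for a value, storing it first if new
graphEdge = []
with io.BytesIO(raw) as file:   # in the real program: open(file_path, 'rb')
    while True:
        data = file.read(4)
        if not data:
            break
        integer = struct.unpack('i', data)[0]
        graphEdge.append(intern(integer, integer))

# Equal values now share a single int object:
print(graphEdge[0] is graphEdge[1])   # True
```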
You could save even more by not creating a list of `int` objects at all, but a Python `array.array` or NumPy array of 32-bit ints, though I'm not familiar with Torch and what it accepts as input.
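Torch does accept NumPy arrays (`torch.from_numpy` even shares their memory), so the whole reading loop can be replaced by one `np.fromfile` call. A sketch, with a small demo file written in place of the real `srcList.bin`:

```python
import numpy as np
import torch

# Demo file standing in for "./../../srcList.bin".
np.array([1, 2, 2, 3], dtype=np.int32).tofile("demo.bin")

graphEdge = np.fromfile("demo.bin", dtype=np.int32)  # 4 bytes per number, no Python int objects
t = torch.from_numpy(graphEdge)                      # zero-copy view; dtype stays int32
# t = t.to('cuda:0')                                 # then move to the GPU as before
print(t.element_size() * t.numel())                  # 16 bytes for these 4 ints
```

This also avoids the float intermediate that `torch.Tensor(list)` creates before `.to(torch.int)`.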