Tags: python, tensorflow, gpu

Should TensorFlow always be using as many CUDA cores as it can?


I'm training a model right now and it's using almost all my VRAM, but my CUDA usage is hovering between 10-20%. Is this normal? I know it's meant to use a lot of VRAM, but I'm surprised it's not using many CUDA cores.

[Screenshot: GPU details in Task Manager]


Solution

  • By default TensorFlow allocates almost all of the GPU's VRAM; to limit that, see here.

    GPU utilization, on the other hand, depends on the batch size (larger batches tend to keep the GPU busier), the size of the neural network (smaller models need fewer resources), the kinds of layers (e.g. convolutions use the GPU more efficiently), and the data pipeline: if the way you move data from disk/RAM to the GPU is a bottleneck, the GPU will sit idle waiting for the next batch instead of computing.