I'm interested in watching the memory usage of a model being run under ollama.
How can I see the memory usage?
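I don't know of a way to do this directly in Ollama, but you can get a rough estimate from your graphics card, e.g. with nvidia-smi. In the process list below, the ollama-runner row (type C, for compute) is the one to look at; here it is using 4868MiB of GPU memory: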
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1345      G   /usr/lib/xorg/Xorg                          378MiB |
|    0   N/A  N/A      2490      G   cinnamon                                     57MiB |
|    0   N/A  N/A      3663      G   ...ures=SpareRendererForSitePerProcess       25MiB |
|    0   N/A  N/A    121270      G   /usr/lib/firefox/firefox                    160MiB |
|    0   N/A  N/A    131205      C   ...p/gguf/build/cuda/bin/ollama-runner     4868MiB |
+---------------------------------------------------------------------------------------+
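If you would rather log that number than read it off the table, here is a minimal Python sketch. It assumes nvidia-smi is on your PATH and uses its CSV query mode (--query-compute-apps), which lists only compute processes. The function names and the "ollama" substring match are my own assumptions for illustration; the runner's process name may differ between Ollama versions, so adjust the match to whatever your table shows.

```python
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' CSV rows (no header, no units)."""
    rows = []
    for line in csv_text.strip().splitlines():
        try:
            # Split from both ends so a comma inside the process name is tolerated.
            pid, rest = line.split(",", 1)
            name, mem = rest.rsplit(",", 1)
            rows.append({"pid": int(pid), "name": name.strip(), "mib": int(mem)})
        except ValueError:
            # Skip rows that are malformed or report a non-numeric memory value.
            continue
    return rows


def gpu_memory_mib(rows, needle="ollama"):
    """Sum GPU memory (MiB) over processes whose name contains `needle`."""
    return sum(r["mib"] for r in rows if needle in r["name"])


def query_nvidia_smi():
    """Run nvidia-smi in CSV query mode and return the parsed rows."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,process_name,used_memory",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)
```

Calling gpu_memory_mib(query_nvidia_smi()) in a loop with a short sleep gives you a rough running log. Like the table, this only covers GPU memory; anything the model keeps in system RAM will not show up here.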