RunPod · 2mo ago
GENGHIS

VRAM stuck at 77% usage

VRAM usage is stuck at 77% on 1 of my 4 GPUs. I've already tried a restart, a hard stop and start, and a reset, and it's still stuck. I don't want to have to switch pods because I have hundreds of GB of data on the volume that will take a long time to set up again. Is there anything else I can do? Pod ID: ox02c3pvm058j3
4 Replies
Jason · 2mo ago
Can you check nvidia-smi? Does running nvidia-smi --gpu-reset -i 0 not fix it?
yhlong00000 · 2mo ago
or look for any rogue processes (e.g., python, torch, tensorflow) that might be keeping memory occupied. If you see any, find their process IDs and kill them manually.
nvidia-smi
kill -9 <PID>
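To make the "find the PIDs" step above a bit more concrete, here is a minimal sketch that asks nvidia-smi for per-GPU memory usage and for the compute processes holding memory. The helper name list_gpu_procs is made up for this example, and the sketch assumes nvidia-smi is on the PATH inside the pod. Inside a container the process list may come back empty even when memory is in use, so an empty result does not prove nothing is holding VRAM.

```shell
#!/usr/bin/env bash
# Sketch: list processes holding GPU memory before deciding what to kill.

# Hypothetical helper (not part of nvidia-smi): reads CSV rows of the form
# "pid, process_name, used_memory" on stdin and prints "PID NAME MEMORY".
list_gpu_procs() {
  awk -F', *' 'NF >= 3 { print $1, $2, $3 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
  # Per-GPU memory usage, to confirm which GPU index is stuck.
  nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv,noheader
  # Compute processes currently holding GPU memory.
  nvidia-smi --query-compute-apps=pid,process_name,used_memory \
    --format=csv,noheader | list_gpu_procs
else
  echo "nvidia-smi not found on PATH" >&2
fi
```

If a PID shows up, it is usually worth trying a plain kill <PID> first so the process can clean up, and only falling back to kill -9 <PID> if it does not exit.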
Jason · 2mo ago
yhlong00000 · 2mo ago
Open a support ticket and we can take a look. At least for his pod, there's nothing occupying VRAM before he starts it, and after the pod is stopped, everything is released as expected.
