GPU memory already in use when pod is initialized. Not able to clear it.

Tried nvidia-smi -r, restarting, and resetting, but there is still memory usage on one GPU in the pod.
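For anyone hitting the same thing, a quick way to check whether any process inside the pod is actually holding that memory is to query NVML directly. This is a minimal sketch, assuming the nvidia-ml-py (pynvml) bindings are installed in the pod; if no process shows up but memory is still reported as used, the allocation is likely held outside the container and only the host can clear it.

```python
# Minimal sketch: list per-GPU memory usage and any compute processes
# still holding memory. Assumes nvidia-ml-py is installed
# (pip install nvidia-ml-py).
from pynvml import (
    nvmlInit,
    nvmlShutdown,
    nvmlDeviceGetCount,
    nvmlDeviceGetHandleByIndex,
    nvmlDeviceGetMemoryInfo,
    nvmlDeviceGetComputeRunningProcesses,
)

nvmlInit()
try:
    for i in range(nvmlDeviceGetCount()):
        handle = nvmlDeviceGetHandleByIndex(i)
        mem = nvmlDeviceGetMemoryInfo(handle)
        print(f"GPU {i}: {mem.used / 1024**2:.0f} MiB used")
        # Each entry here is a process that still owns GPU memory.
        # An empty list with non-zero usage suggests the memory is
        # held by something outside this container.
        for proc in nvmlDeviceGetComputeRunningProcesses(handle):
            used = proc.usedGpuMemory
            used_mib = "unknown" if used is None else f"{used / 1024**2:.0f} MiB"
            print(f"  pid {proc.pid}: {used_mib}")
finally:
    nvmlShutdown()
```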
4 Replies
zkreutzjanz
zkreutzjanz4mo ago
[attachment with no description]
Finley
Finley4mo ago
@zkreutzjanz I've noted the pod/machine ID so we can take a look internally. In the meantime, the best thing to do would be to rent another set of GPUs if one is available. Sorry for the inconvenience.
zkreutzjanz
zkreutzjanz4mo ago
@Finley FYI, I am still renting this pod. I can't take it down because I need a bunch of A6000s on the same machine. The extra memory usage is still there.