I am not using the GPU, but someone else's processes are occupying it. What is the solution?
ID: xx5vmcdbbkab3m
A100 ×6 / 1-week service.
When I first initialized it, it showed that someone else's processes were already occupying my GPUs.
How should I handle this?
Also, I am unable to use two of the GPUs. How does the refund process work in this case?
```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 11.74 GiB. GPU 3 has a total capacty of 79.14 GiB of which 1.85 GiB is free. Process 1045963 has 32.13 GiB memory in use. Process 3802704 has 45.14 GiB memory in use. Of the allocated memory 42.12 GiB is allocated by PyTorch, and 1.59 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CON
```
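A quick way to confirm that the memory is held by other tenants' processes rather than your own (a minimal sketch, assuming PyTorch and the nvidia-smi CLI are available inside the pod):

```python
# Minimal sketch: report free/total memory per GPU and list the PIDs
# that currently hold memory on the cards.
import subprocess

import torch

# Per-device free/total memory as reported by the CUDA driver.
for i in range(torch.cuda.device_count()):
    free, total = torch.cuda.mem_get_info(i)
    print(f"GPU {i}: {free / 2**30:.2f} GiB free of {total / 2**30:.2f} GiB")

# nvidia-smi shows which PIDs own the allocations; if none of them
# are yours, the card is occupied by another process.
print(subprocess.run(
    ["nvidia-smi", "--query-compute-apps=pid,used_memory", "--format=csv"],
    capture_output=True, text=True,
).stdout)
```

If the processes from the error above (1045963 and 3802704) are still alive, their PIDs should appear in the nvidia-smi listing even though they were not started from your session.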
In the end, I restarted the pod ...
Same error.
Ah, you haven't run anything and the memory is already used?
@살려주세요
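One way to catch this situation before a job even launches is to fail fast when a card is already mostly occupied. A minimal sketch (the helper name and the 90% free threshold are illustrative, not part of any PyTorch or platform API):

```python
# Hedged sketch: raise immediately if a GPU is already largely occupied,
# instead of hitting an OOM partway through training.
import torch

def assert_gpu_mostly_free(device: int, min_free_fraction: float = 0.9) -> None:
    # Illustrative helper: compare driver-reported free memory to a threshold.
    free, total = torch.cuda.mem_get_info(device)
    if free / total < min_free_fraction:
        raise RuntimeError(
            f"GPU {device}: only {free / 2**30:.2f} GiB of "
            f"{total / 2**30:.2f} GiB free; another process may be occupying it."
        )

for i in range(torch.cuda.device_count()):
    assert_gpu_mostly_free(i)
```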
Escalated To Zendesk
The thread has been escalated to Zendesk!