RunPod
Thread: Stuck vLLM startup with 100% GPU utilization
Created by jojje on 1/26/2025 in #⚡|serverless · 7 replies
- @Poddy
- Hi, is there any update on this issue? I am seeing this quite consistently, although not always, on A40 GPUs when using vLLM (not serverless).

Thread: vLLM Inconsistently Hangs at NCCL Initialization
Created by maple on 4/9/2025 in #⛅|pods-clusters · 3 replies
- We weren't, and I think forcing 12.4 fixed the issue. Thanks!
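The "forcing 12.4" remark above refers to pinning the CUDA toolkit version. Before pinning anything, it helps to confirm which toolkit a pod actually exposes; a minimal stdlib-only sketch (the helper name is hypothetical, and it only inspects `nvcc` on the PATH, not the driver):

```python
import re
import shutil
import subprocess

def cuda_toolkit_version():
    """Return the CUDA toolkit version (e.g. "12.4") reported by nvcc,
    or None when no toolkit is on the PATH.

    nvcc prints a line like "Cuda compilation tools, release 12.4, ...",
    so the version is extracted from the "release X.Y" token.
    """
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None  # no toolkit visible in this environment
    out = subprocess.run([nvcc, "--version"],
                         capture_output=True, text=True).stdout
    m = re.search(r"release (\d+\.\d+)", out)
    return m.group(1) if m else None
```

Comparing this value against the driver's supported CUDA version from `nvidia-smi` is one way to spot the kind of mismatch that "forcing 12.4" works around.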
Thread: Serverless broke for me overnight, I can't get inference to run at all.
Created by Mandragora.ai on 5/10/2024 in #⚡|serverless · 101 replies
- Great, could you please let me know when this is rolled back?
- At some point yesterday(?) ray init stopped working on both.
- @Alpay Ariyak More details I remember that may be helpful: I first started experiencing hanging ray init on EU-SE-1 A4000/A5000 instances. At the same time, ray init was working fine on US-OR-1 A100 SXM instances.
- Yes, it did.
- This is with a multi-GPU setup on vLLM.
- Tried on multiple different GPUs.
- I am using the exact same commands and installation as just a few days ago, which worked fine.
- Yes, not sure if this is connected to serverless, but I have been doing dev work on vLLM in a pod on the secure cloud, and within the last 1-2 days I have also been stuck on Ray initialization/worker creation.
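Several messages above describe ray init hanging indefinitely rather than failing with an error. A stdlib-only sketch of a watchdog that turns such a hang into a detectable timeout (the helper name is illustrative; in practice the callable would wrap `ray.init()` or the vLLM engine startup):

```python
import threading

def finishes_within(fn, timeout_s):
    """Run fn on a daemon thread and wait up to timeout_s seconds.

    Returns (finished, result): finished is False when fn is still
    running at the deadline, i.e. the call looks hung. The worker is a
    daemon thread, so a hung fn will not block interpreter shutdown;
    note this detects a hang but cannot cancel the underlying call.
    """
    box = {}

    def worker():
        box["result"] = fn()

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout_s)
    return (not t.is_alive(), box.get("result"))
```

Wrapping the suspect initialization this way lets a launcher log "init hung after N seconds" and retry on another instance, instead of sitting at 100% GPU utilization forever.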