RunPod
Created by stupidity on 2/21/2025 in #⚡|serverless
RunPod workers getting staggered when I call more than 1 at a time
So I'm currently running a service connected to the endpoint, and I've noticed that the workers tend to be deployed in a staggered way. I have a function that splits a workload into 50 RunPod jobs, but for some reason my endpoint does not actually use all 50 workers that are ready. Instead the workers seem to get deployed staggered: I'll see that 36 of the jobs went through and are running while 14 jobs are still in the queue, even though I have 14 untouched workers. I've set my scaling strategy to request count with the count set to 1 (which should be the most aggressive setting). I'm stuck trying to resolve this because for the life of me I can't figure it out. It's causing what should be a 30-second task to take over 1 minute, since I have to wait for the staggered deployment and results. Has anyone else had this issue, or does anyone have a recommendation? Thanks, I appreciate it!
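One thing worth ruling out is whether all 50 requests actually hit the endpoint at once, or whether the client submits them sequentially. A minimal sketch of the fan-out pattern described above (names like `fan_out` and `submit_job` are hypothetical; `submit_job` is a stand-in for whatever RunPod SDK or REST call you use to queue a job):

```python
from concurrent.futures import ThreadPoolExecutor

def chunk(workload, n_jobs):
    """Split a list into n_jobs roughly equal chunks."""
    size, rem = divmod(len(workload), n_jobs)
    chunks, start = [], 0
    for i in range(n_jobs):
        end = start + size + (1 if i < rem else 0)
        chunks.append(workload[start:end])
        start = end
    return chunks

def submit_job(payload):
    # Placeholder: with the real SDK this would be an async submit,
    # e.g. something like runpod.Endpoint("<endpoint-id>").run({...}).
    return {"input": payload, "status": "IN_QUEUE"}

def fan_out(workload, n_jobs=50):
    # Submit all chunks concurrently so the queue fills immediately,
    # rather than one request at a time in a loop.
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(submit_job, chunk(workload, n_jobs)))
```

If the jobs are already submitted concurrently like this and workers still spin up staggered, the delay is on the scheduling side rather than the client side.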
3 replies
RunPod
Created by stupidity on 2/14/2025 in #⚡|serverless
Do RunPod serverless GPUs support NVIDIA MIG?
Hello! I was wondering if anyone has experience setting up NVIDIA MIG (GPU partitioning) on RunPod serverless. I'm currently trying to deploy a ~370-million-parameter model to serverless inference, and we wanted to see if it would be possible to set up GPU partitioning on one worker to work around the serverless worker limitations. If anyone has experience, or even knows whether RunPod supports this, it would be much appreciated. Thank you!
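For context, MIG is configured on the host with `nvidia-smi`, which is the crux of the question: it requires root access to the node and a MIG-capable GPU (A100/H100 class), not just access inside a worker's container. A sketch of what host-level setup looks like (profile names vary by GPU model; shown here for an A100):

```shell
# Enable MIG mode on GPU 0 (requires root; may need a GPU reset to take effect)
nvidia-smi -i 0 -mig 1

# Create three 2g.10gb GPU instances and their default compute instances
nvidia-smi mig -i 0 -cgi 2g.10gb,2g.10gb,2g.10gb -C

# List the resulting MIG devices visible to CUDA
nvidia-smi -L
```

Whether a serverless worker exposes this level of node control is the part that needs confirmation from RunPod.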
4 replies