RunPod•2w ago
mambo no. 5

Why the hell are my delay times so high, and why am I bearing all the costs??

Yesterday everything was working fine and delay times were a couple of seconds. Now the delay times are getting ridiculous, and I'M being charged for the delay on top of the execution??
Dj
Dj•2w ago
We're currently resolving an incident that affects serverless job times. I will update you when I have news about extra spend as a result of the outage
mambo no. 5
mambo no. 5OP•2w ago
Will we be reimbursed for the unnecessary extra spending on delay times? Why are we even being charged for delay time? I was always under the impression I'm paying for execution time.
Dj
Dj•2w ago
I don't know for certain yet, but once the engineering discussion about the resolution wraps up in the thread I'll work on it. Worst case, I'm capable of issuing refunds myself, but I'd prefer an automated solution too 😅
mambo no. 5
mambo no. 5OP•2w ago
Okay, please provide an update here when you have more details.
mambo no. 5
mambo no. 5OP•2w ago
If I load in $100 in credits and spin up 10 workers, what happens when my balance falls below $100? @Dj And what if I want to deploy a new endpoint, will I still be able to deploy 10 workers?
Dj
Dj•2w ago
Yes, it's just a soft check at the time of registration; once you press upgrade you're fine.
AdamOH
AdamOH•2w ago
We have a Stable Diffusion endpoint that has failed to boot since the outage last night. Even though the GPUs keep trying to boot and start serving requests, we're being charged for that GPU time despite it being broken by the outage, which we're still trying to debug. This has been a major hit to our business!
Jason
Jason•2w ago
Hey, RunPod support can look into that. Have you made a support request?
mambo no. 5
mambo no. 5OP•2w ago
Exactly! We're having the same issue over here as well. I believe we deserve some form of reimbursement! It doesn't make sense for us to bear the costs of RunPod's failure :( @Dj
Dj
Dj•2w ago
Hey, can you share your endpoint ID with me so I can look into this? Same for you, can you also share an endpoint ID? I want to get people reimbursed, but I can't until I know whether I need to do it manually or whether we're issuing refunds automatically. I'm following up with our engineering team now, but they're going to want to see affected user IDs as well.
mambo no. 5
mambo no. 5OP•2w ago
Here's the ID; you can take a look at yesterday's earlier requests to find the problematic ones. Some of them took 12+ minutes when it's usually a couple of seconds. proposed_emerald_fly
Jason
Jason•2w ago
No, what you sent is the name. The ID looks like random characters; it's in your /run URL.
mambo no. 5
mambo no. 5OP•2w ago
oh this should be the one: u7hn1oucmnkkc5
Jason
Jason•2w ago
Yep, that seems to be it. Now let Dj check it.
Dj
Dj•2w ago
Everything I see seems to be normal behavior for your workload, but I can only see the lifecycle of each worker (incoming request, pod started, job finished, pod stopped). You should be able to email support for help with receiving reimbursement.
mambo no. 5
mambo no. 5OP•2w ago
This is from yesterday when I made this thread. Are you telling me the 5-12 minute delay times are normal? If so, I think we're going to have to reconsider hosting on RunPod.
Jason
Jason•2w ago
Can you check your endpoint logs? You might be able to see what's wrong with those workers.
riverfog7
riverfog7•2w ago
It depends on your model though, and on cold start / fast boot.
mambo no. 5
mambo no. 5OP•2w ago
Mate, look at the more recent requests in the pic. Usual delay time is 4-5s. Never over a minute and nothing close to 12 minutes
riverfog7
riverfog7•2w ago
what's ur model
mambo no. 5
mambo no. 5OP•2w ago
It's just running a ComfyUI workflow for Wan video gen. I'm telling you it's not about the model. I've run the exact same workflow over the past week and never seen anything remotely close to 12 minutes. I'm still handling requests today and the delay time isn't anywhere near even a minute.
riverfog7
riverfog7•2w ago
Maybe the logs would help with debugging?
Jason
Jason•2w ago
That's why I'm telling you to check the logs, if possible.
riverfog7
riverfog7•2w ago
yeah
Jason
Jason•2w ago
Especially for that time window, and for that specific worker.
riverfog7
riverfog7•2w ago
From a dev's perspective, the only info they (I mean the people here) have is: 1. pods are sometimes taking longer to load. You can't debug with just that.
Jason
Jason•2w ago
But as Dj said, you can create a support ticket or email support with a reimbursement request.
riverfog7
riverfog7•2w ago
Yeah, but if you want to debug this together, we need the logs.
mambo no. 5
mambo no. 5OP•2w ago
How do I get the logs for those ones? They've disappeared from the requests tab.
Jason
Jason•2w ago
Is there a logs tab? Not in the requests tab, a separate one.
mambo no. 5
mambo no. 5OP•2w ago
I can't find them anymore; they've been buried under multiple other requests :( Anyway, the bottom line is: will we all be getting reimbursed or not?
Jason
Jason•2w ago
I think the best way to get that answer is to ask in a support request / ticket. I'm just trying to see what the problem is from the logs, if possible; it's fine if you can't find them anymore.
riverfog7
riverfog7•2w ago
My thoughts about the delay times:
1. The 4-5 second delay time you had before was a result of RunPod's fast boot feature, which essentially keeps the model loaded in VRAM.
2. The 5-minute delay time was probably caused by a cold start.

Having the following would help with debugging:
1. The idle timeout in your serverless settings
2. The image you're using
3. The model
4. The interval at which you send requests
5. Hopefully the logs, if possible

Possible causes of the high delay:
1. The idle timeout is too low and workers do a cold boot every time (or you send requests one at a time)
2. If you're using a non-official image, it may not be cached on the host, which causes a high boot time
3. RunPod's network volume has speed issues
4. A CUDA memory leak (the worker could die after processing one request)
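If you can still hit the API, something like the sketch below can at least separate queue/boot delay from execution per job. Treat the delayTime / executionTime field names as assumptions from memory of the /status response, and the job ID here is just a placeholder:
```python
# Rough sketch: compare delay vs execution time for one serverless job.
# Assumes RUNPOD_API_KEY is set; field names may differ, check your own response.
import os
import requests

ENDPOINT_ID = "u7hn1oucmnkkc5"        # the endpoint from this thread
JOB_ID = "REPLACE_WITH_A_JOB_ID"      # hypothetical placeholder

resp = requests.get(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/status/{JOB_ID}",
    headers={"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"},
    timeout=30,
)
resp.raise_for_status()
job = resp.json()

# delayTime / executionTime are reported in milliseconds, as far as I know.
delay_s = job.get("delayTime", 0) / 1000
exec_s = job.get("executionTime", 0) / 1000
print(f"status={job.get('status')}  delay={delay_s:.1f}s  execution={exec_s:.1f}s")
```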
Jason
Jason•2w ago
Images won't be re-downloaded as long as your worker stays idle, and if your worker is initializing and that counts as delay time, it means your endpoint is new. So feel free to eliminate that one.
mambo no. 5
mambo no. 5OP•2w ago
Yeah, I have an email ticket open with them, but they haven't been very vocal with their responses.
Jason
Jason•2w ago
What does vocal mean?
mambo no. 5
mambo no. 5OP•2w ago
They didn't provide any meaningful information other than saying they have fixed the outage. @Dj can you confirm whether we're even supposed to pay for delay times or just execution times? There is no information on this at all. If I have a request with a delay of 2 minutes and an execution of 2 minutes, do I pay for 2 or 4 minutes?
Dj
Dj•2w ago
You're not paying for delay time; delay time is stuff like how long it takes the image to download and start, and that's on us. Execution time is how long it takes the model to load and actually do the thing. For that example, 2 minutes.
Jason
Jason•2w ago
Delay time can be charged too; the thing is, you're charged whenever the worker is running.
mambo no. 5
mambo no. 5OP•2w ago
?? Who is right here? Do we need to bring the CEO in?
Jason
Jason•2w ago
What's charged is only the time your worker is running. That can be delay time or execution time.
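To put rough numbers on that (the per-second rate below is a made-up placeholder, not an actual RunPod price), the math is just active worker time multiplied by the rate:
```python
# Sketch of billing under that description: you pay for every second a worker
# is up, whether the dashboard labels that time delay or execution.
PRICE_PER_SECOND = 0.0004  # hypothetical $/s for one GPU worker, not a real price


def estimate_cost(delay_seconds: float, execution_seconds: float) -> float:
    """Estimated cost of one request for the time a worker was running."""
    active_seconds = delay_seconds + execution_seconds
    return active_seconds * PRICE_PER_SECOND


# The example from above: 2 minutes of delay + 2 minutes of execution.
print(f"${estimate_cost(120, 120):.4f}")  # billed for 4 minutes of worker time
```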
mambo no. 5
mambo no. 5OP•2w ago
@Dj ^ is that true or not? If it's true, then how do I even quantify how much I'm paying?
Jason
Jason•2w ago
Wait, so now model loading counts as execution time?
Dj
Dj•2w ago
Candidly, I'm not the best source of information on this, but my understanding is that it depends on how your worker is set up. Loading a model should be delay time, but I'm pretty sure you can do it "wrong" and load your model on request. Technically nothing stops you from shooting yourself in the foot; any code inside the handler function, which is literally responsible for responding to your request, is run time.
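A rough sketch of the foot-gun I mean; load_model here is a hypothetical stand-in for whatever heavy setup your worker actually does, not your real workflow:
```python
# Anti-pattern sketch: anything inside the handler runs on every request,
# so a model load here gets repeated and billed as execution time.
import runpod


def load_model():
    # Hypothetical stand-in: pretend this pulls weights and moves them to the GPU.
    return object()


def handler(event):
    model = load_model()      # re-runs on EVERY request -> counted as execution time
    _ = model                 # placeholder: your actual inference would use `model` here
    return {"output": "done"}


runpod.serverless.start({"handler": handler})
```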
Jason
Jason•2w ago
Hmm did you ask for reimbursement?
Dj
Dj•2w ago
If you want me to take a look at your template and help you understand your delay time, I can skim it over now, but it's 2 AM on a weekend, so providing full support is slightly out of my scope at this time. I'm happy to answer questions, etc., but fixing a template for you is something I'd rather do on Monday 😛
Jason
Jason•2w ago
Yeah, usually it's delay time; and per the docs, they recommend putting model loading outside the handler.
mambo no. 5
mambo no. 5OP•2w ago
Yeah, I get where you're at right now haha, it's 3 AM for me. Yep, we'll see what they respond with on Monday.
Jason
Jason•2w ago
Ohh okay
Dj
Dj•2w ago
Support was directed to provide reimbursement for the length of the outage, IIRC 27 minutes, and it was confirmed that Pods were unaffected; only serverless users (like you!) were hit.
Jason
Jason•2w ago
Basically, everything that happens from your Dockerfile's ENTRYPOINT or CMD up until you call runpod.serverless.start() is delay time, and it's charged because the worker is already running at that point.
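So the layout the docs recommend looks roughly like the sketch below; again, load_model is just a stand-in for your own setup, not anything RunPod ships:
```python
# Recommended-layout sketch: do the heavy setup at import time, before
# runpod.serverless.start(), so it happens once per worker boot (delay time)
# instead of being repeated inside the handler on every request.
import runpod


def load_model():
    # Hypothetical stand-in: load weights once when the worker boots.
    return object()


MODEL = load_model()  # runs once per worker boot, before start() -> delay time


def handler(event):
    _ = MODEL                 # placeholder: run inference with the preloaded model
    return {"output": "done"}


runpod.serverless.start({"handler": handler})
```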
