RunPod (2mo ago)
sahir

queue delay times

Hi, I'm seeing really long delay times even though there's nothing in the queue, and this is a really small CPU serverless endpoint. Any idea what causes this?
17 Replies
youssef (2mo ago)
I'm having the same issue on 16 GB/24 GB. My request stays a shit ton of time in the queue, and these are only two items.
youssef (2mo ago)
cc @PRB - any issue on the status page?
PRB (2mo ago)
@sahir which datacenter are you running this in? @youssef can you open a support ticket? We are already looking at this, but it will be nice to keep track and get back to you.
sahir (OP, 2mo ago)
All locations were selected, so it's made workers here.
sahir (OP, 2mo ago)
This is happening to my other endpoints too now
Kays (2mo ago)
Same here, almost 2 minutes cold start every time
Kays (2mo ago)
But once every few requests it goes back to <5 seconds.
PRB (2mo ago)
Are you guys on CPU endpoints or GPU endpoints? @Kays please reply so I can resolve this faster.
Kays (2mo ago)
GPU endpoints.
It's mostly A100 for me.
I mean H100*
PRB (2mo ago)
The endpoint ID will help.
Kays (2mo ago)
pury32p7r6r4wf
I can give you an example test request if you like.
flash-singh (2mo ago)
Your cold starts are high. Are you loading the model from a network volume, or is the model just too big?
Kays (2mo ago)
I'm not using network volumes; the model is flux-dev (24 GB).
But what's weird is that the cold start is sometimes extremely quick, like under 5 seconds.
Hey there, any updates on this? Is it just the model being too big? @PRB @flash-singh
Thanks! Seems to be fixed now somehow.
flash-singh (2mo ago)
That's just FlashBoot. Anything over 10s should be your ideal cold start. Is your model baked into the container image?
Kays (2mo ago)
Yes, it is in the container.
Right now I'm getting around 50/50 FlashBoots.
flash-singh (2mo ago)
That's about right; it depends on workload and capacity. For H100s that's really good if your p50 is hitting FlashBoot.
Kays (2mo ago)
Cool, yes I’m happy with that rate, yesterday was more like 90-10 that’s why I mentioned it
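
For anyone wanting to check the "50/50" and "p50 is hitting FlashBoot" figures against their own endpoint, here is a minimal sketch. It assumes you have already collected per-request delay times in milliseconds (however you log them) and treats anything under 5 seconds as a fast start, which is the figure mentioned in this thread rather than an official threshold. The function name, the threshold constant, and the sample numbers are all made up for illustration.

```python
from statistics import median

# Assumption: "fast" means under 5 seconds, the figure quoted in this thread.
FAST_START_THRESHOLD_MS = 5_000


def summarize_delays(delay_ms, threshold_ms=FAST_START_THRESHOLD_MS):
    """Summarize a list of per-request delay times given in milliseconds."""
    if not delay_ms:
        raise ValueError("need at least one delay sample")
    # Count requests that started faster than the threshold.
    fast = sum(1 for d in delay_ms if d < threshold_ms)
    p50 = median(delay_ms)
    return {
        "samples": len(delay_ms),
        "fast_share": fast / len(delay_ms),  # fraction of fast starts
        "p50_ms": p50,                       # median delay
        "p50_is_fast": p50 < threshold_ms,   # is the median a fast start?
    }


# Hypothetical samples: a mix of fast starts and roughly 2-minute cold starts.
samples = [3200, 118000, 4100, 95000, 2800, 4700]
summary = summarize_delays(samples)
print(summary)
```

If `p50_is_fast` is true, at least half of the sampled requests came in under the threshold, which is the situation described above as "really good" for H100s.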
