Very slow cold start times

Does anyone know why I would get such variable cold start times, anywhere from half a second to 90 seconds? I'm using the standard vLLM Serverless template.
2 Replies
digigoblin, 8mo ago
It takes a while to load large models, especially from network storage, and you only really benefit from FlashBoot if you send requests constantly.
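To see this pattern in your own numbers, you could time a batch of requests and split the slow (probably cold) ones from the fast ones. This is only a rough sketch: `send_request` is a hypothetical callable you would write yourself to hit your endpoint, and the 5-second cutoff for "cold" is an arbitrary assumption, not something from the vLLM template.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_requests(send_request: Callable[[], object], count: int,
                  pause_s: float = 0.0) -> List[float]:
    """Call send_request `count` times and return each call's duration in seconds.

    send_request is a placeholder (hypothetical) for whatever code sends one
    request to your endpoint and waits for the response.
    """
    durations = []
    for _ in range(count):
        start = time.perf_counter()
        send_request()
        durations.append(time.perf_counter() - start)
        # Optional gap between requests; longer gaps make cold starts more likely.
        time.sleep(pause_s)
    return durations


def summarize(durations: List[float], cold_threshold_s: float = 5.0) -> Dict[str, object]:
    """Split durations into 'cold' and 'warm' using an assumed threshold."""
    cold = [d for d in durations if d >= cold_threshold_s]
    warm = [d for d in durations if d < cold_threshold_s]
    return {
        "requests": len(durations),
        "cold_count": len(cold),
        "warm_count": len(warm),
        "slowest_s": max(durations) if durations else None,
        "median_s": statistics.median(durations) if durations else None,
    }
```

If you run it once with no pause and once with a long pause between requests, the difference in `cold_count` should show how much steady traffic matters for your setup.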
MattArgentina (OP), 8mo ago
Makes sense, thank you
