Very slow cold start times
Does anyone know why I'm getting such variable cold start times, anywhere from half a second to 90 seconds? I'm using the standard vLLM Serverless template.
2 Replies
Large models take a while to load, especially from network storage, and you only really benefit from FlashBoot if you send requests constantly.
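If you want to see how much of the spread comes from cold workers, a rough way is to time a batch of requests and split them by a latency cutoff. This is only a minimal sketch: `send_request` is a hypothetical callable you would write yourself to hit your endpoint, and the 5-second threshold is an arbitrary assumption, not a value from the template.

```python
import time
from typing import Callable, List, Tuple


def time_requests(send_request: Callable[[], object], count: int,
                  pause_s: float = 0.0) -> List[float]:
    """Call send_request `count` times; return each call's duration in seconds."""
    durations = []
    for i in range(count):
        start = time.perf_counter()
        send_request()  # hypothetical: your own function that calls the endpoint
        durations.append(time.perf_counter() - start)
        # Optionally idle between calls to let the worker scale down.
        if pause_s and i < count - 1:
            time.sleep(pause_s)
    return durations


def split_cold_warm(durations: List[float],
                    threshold_s: float = 5.0) -> Tuple[List[float], List[float]]:
    """Split durations into (cold, warm) using an assumed latency cutoff."""
    cold = [d for d in durations if d >= threshold_s]
    warm = [d for d in durations if d < threshold_s]
    return cold, warm
```

Running it once with no pause and once with a long pause between requests should show whether the slow responses line up with idle periods.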
Makes sense, thank you