nielsrolf
RunPod
• Created by nielsrolf on 11/12/2024 in #⚡|serverless
Incredibly long startup time when running 70b models via vllm
The other thing it frequently gets stuck on is:
Yesterday I was told that this might be due to issues with the model itself, but it has now happened with different models, and sometimes those models later worked.
11 replies
OK, that's what I was told when I opened a support ticket yesterday, but then I'll remove that again
Yes, it would indeed be better if that weren't necessary, but this is how the vllm-worker appears to be implemented. I could live with a long startup time because I mostly want to do batch requests, but if you know how to deploy the vLLM template with a preloaded model, I'd gladly use that
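One way to avoid the per-cold-start model download is to bake the weights into the worker image at build time. A rough sketch of that approach, based on the build-arg workflow described in the runpod-workers/worker-vllm repository (the model name, image tag, and exact build-arg names here are examples and should be verified against the current README):

```shell
# Sketch: build a worker-vllm image with the model preloaded,
# so the serverless worker doesn't download 70B weights on cold start.
# Build-arg names (MODEL_NAME, HF_TOKEN) should be checked against the
# worker-vllm README for the version you are using.
git clone https://github.com/runpod-workers/worker-vllm.git
cd worker-vllm
docker build -t yourdockeruser/vllm-70b-preloaded:v1 \
  --build-arg MODEL_NAME="meta-llama/Llama-3.1-70B-Instruct" \
  --build-arg HF_TOKEN="<your-huggingface-token>" \
  .
docker push yourdockeruser/vllm-70b-preloaded:v1
```

The resulting image can then be used as a custom serverless endpoint image instead of the stock quick-deploy vLLM template; the trade-off is a much larger image (and longer image pull on a fresh host) in exchange for skipping the Hugging Face download on every cold start.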
Thanks, it now says "Ticket created"
I think my main issue is the same as https://github.com/runpod-workers/worker-vllm/issues/112