RunPod
Created by zethos on 1/29/2025 in #⚡|serverless
Need help in fixing long running deployments in serverless vLLM
Yeah, also Docker Hub has a size limit of 100 GB, so I cannot put the model files inside the Docker image and upload it to Docker Hub.
18 replies
Can you tell me the exact template? Or were you referring to the vLLM template?
yeah, thanks.
Yeah, previously I tried packing several Whisper and BERT-based models into a single Docker image and it worked. Maybe because the image size was small, hardly 20 GB max.
Yes, this is what I am going to do. The model has around 51 files with a total size of around 240 GB. I am thinking of building the Docker image with the whole 245 GB of files inside using Option 2 mentioned here: https://github.com/runpod-workers/worker-vllm?tab=readme-ov-file#option-2-build-docker-image-with-model-inside Do you think it will be too much? 245 GB plus around 20 GB for Ubuntu and the CUDA drivers.
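For reference, the linked Option 2 bakes the weights into the image at build time. A minimal sketch of that flow, assuming the `MODEL_NAME` and `BASE_PATH` build args described in the worker-vllm README (verify the exact arg names against the linked section; the image tag and model id below are placeholders):

```shell
# Clone the worker and build an image with the model downloaded inside it.
# MODEL_NAME / BASE_PATH follow the worker-vllm README's Option 2;
# check the linked page before relying on these names.
git clone https://github.com/runpod-workers/worker-vllm.git
cd worker-vllm
docker build -t youruser/vllm-worker:dev \
  --build-arg MODEL_NAME="your-org/your-model" \
  --build-arg BASE_PATH="/models" \
  .
```

An image this size would also need a registry without Docker Hub's size cap, as noted earlier in the thread.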
Yes. I increased the execution timeout and it works for a while. But then when the worker goes idle, it has to load the models again, and that's a 15-20 min wait.
@nerdylive Could you please help? I tried loading the model onto a network volume using a pod and then attaching the network volume to the serverless instance. Still, it's taking a long time to load.
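One common pattern for making the attached volume actually pay off is to point vLLM at the weights already on the volume, so a cold start reads them from disk instead of re-downloading. A sketch, not an official RunPod handler: `/runpod-volume` is the usual serverless mount point, but the `resolve_model_path` helper and the `models/` directory layout are hypothetical:

```python
import os

def resolve_model_path(model_id: str, volume_root: str = "/runpod-volume") -> str:
    """Prefer a local copy of the weights on the attached network volume.

    Falls back to returning the Hugging Face model id, in which case
    vLLM downloads the weights on cold start.
    """
    # Hypothetical layout: /runpod-volume/models/<org>--<name>/
    local_dir = os.path.join(volume_root, "models", model_id.replace("/", "--"))
    # A directory containing weight shards counts as a usable local copy.
    if os.path.isdir(local_dir) and any(
        f.endswith((".safetensors", ".bin")) for f in os.listdir(local_dir)
    ):
        return local_dir
    return model_id

# Then hand the result to vLLM, e.g.:
# llm = vllm.LLM(model=resolve_model_path("your-org/your-model"))
```

Even with the weights on the volume, loading ~240 GB into GPU memory still takes minutes; keeping one active (always-on) worker is the usual way to avoid paying that cold-start cost repeatedly.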
RunPod
Created by zethos on 1/13/2024 in #⚡|serverless
#How to upload a file using an upload API in GPU serverless?
Thank you for the confirmation. Before trying these two approaches, I was checking whether a direct upload would be possible.
4 replies