where should I put my 30GB of models?
I'm trying to use https://github.com/blib-la/runpod-worker-comfy to make a serverless endpoint with a customized Docker image. In my case, I have a dozen custom nodes, which were easy to install via the Dockerfile (a RUN git clone plus a RUN pip install of each node's requirements). But I also have 30GB of additional models that my ComfyUI install needs. The README suggests two different methods for deploying your own models: (1) copying/downloading them directly into the image during the build, or (2) creating a network volume that gets mounted at runtime. But what are the pros/cons of each approach?
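For context, the custom-node part of my Dockerfile follows roughly this pattern (the base image and node repo here are just placeholders, not my actual ones):

```dockerfile
# Placeholder base image -- in practice this builds on the runpod-worker-comfy image
FROM some-comfy-base-image:latest

# For each custom node: clone it into ComfyUI's custom_nodes folder and install its requirements
# (adjust /comfyui to wherever ComfyUI lives in the base image)
RUN git clone https://github.com/example/ComfyUI-Example-Node.git \
        /comfyui/custom_nodes/ComfyUI-Example-Node && \
    pip install -r /comfyui/custom_nodes/ComfyUI-Example-Node/requirements.txt
```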
If I use a network volume, what are the speed implications? I'm just imagining trying to load 30GB on the fly over a home network -- it would take ages. On the other hand, if I design my workflows well, and ComfyUI keeps the models in memory, perhaps it's not that big of a deal? Also, how would I go about testing this locally? I'm assuming this is a well-documented task, but I'm not even sure what to Google for. I'm running Docker locally through WSL/Ubuntu.
So far, I have been COPYing the 30GB of models into the Docker image during the build process and pushing it to Docker Hub. Surprisingly, my 78GB image pushed to Docker Hub with no complaints, and it's currently deploying to RunPod Serverless. But it is taking AGES to deploy. This will significantly slow down my dev process, but presumably the actual runtime performance will be faster?
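Concretely, the "bake it into the image" approach is just a couple of COPY lines in the Dockerfile plus a normal build-and-push; a rough sketch (the paths and image tag are placeholders):

```dockerfile
# Method 1: bake the models into the image itself (adjust /comfyui/models to
# whatever model directory the base image expects)
COPY models/checkpoints/ /comfyui/models/checkpoints/
COPY models/loras/       /comfyui/models/loras/
```

```bash
# Build the (very large) image and push it to Docker Hub
docker build -t mydockerhubuser/comfy-worker:dev .
docker push mydockerhubuser/comfy-worker:dev
```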
Thanks in advance.
I'm in the same boat.
Commenting to follow the conversation.
Also, @jeffcrouse, do you know if we can use a container registry other than Docker Hub?
I already have my images in AWS's and Azure's container registries; can we use those?
Sorry, I don't know about any other container registries. Are there advantages to the other ones? I was worried that Docker Hub would reject mine because I had read that they have a 10GB-per-layer size limit, but I was pleased to find that it pushed successfully.
The fact that the Serverless "New Endpoint" form doesn't specifically mention Docker Hub seems to suggest that it might be registry-agnostic?
No, I just have credits with those cloud providers, whereas I would need to pay for a private repo on Docker Hub.
ah, gotcha
After some experimenting, I can answer one of my own questions -- one that is probably obvious to anyone already familiar with Docker. It's easy to mount a directory from your local file system inside a Docker container, so that's how I can test locally. In fact, this is already set up in runpod-worker-comfy: I just had to copy my models into data/runpod-volume, which is mounted via the docker-compose.yaml file.
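For anyone else looking for this, the relevant piece is the volume mapping in the compose file, which is along these lines (the service name here is just illustrative): the local data/runpod-volume folder gets mounted at /runpod-volume inside the container, the same path where RunPod mounts a network volume on a serverless worker.

```yaml
# docker-compose.yaml (excerpt)
services:
  comfyui-worker:
    volumes:
      - ./data/runpod-volume:/runpod-volume
```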
I spent all morning converting and testing the approach where I create a network storage volume, populate it with my models, and then attach it to a serverless endpoint, and I can report that it is plenty fast for me.
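In case it helps anyone: one straightforward way to populate the volume is to attach it to a temporary pod and download the models there; a rough sketch (the URL and filename are placeholders, and note that a pod sees the volume at /workspace while the serverless worker sees it at /runpod-volume):

```bash
# On a temporary pod with the network volume attached (mounted at /workspace)
cd /workspace
mkdir -p models/checkpoints models/loras

# Pull the models straight onto the volume (placeholder URL and filename)
wget -O models/checkpoints/my_model.safetensors \
    "https://example.com/my_model.safetensors"
```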
However, in the "Serverless > New Endpoint" dialog, as soon as I select a network volume, I can only choose the 80 GB GPU PRO worker configuration (the rest are "unavailable"), and only with Max Workers = 1. Is this a known limitation of attaching a network volume?
Yes, using a network volume will lock you to that specific data center. Your choice of GPUs and their availability will be limited.
@yhlong00000 Is there an easy way to view the current GPU availability at different data centers so that I can make a more informed choice about which data center I choose for my network volume?
Just go to create a pod, or look while creating a network volume; the page will show you GPU availability in every data center,
but no specific numbers, for security reasons
ok, thanks