optimize ComfyUI on serverless
I have ComfyUI deployed on RunPod serverless: I send JSON workflows to the endpoint and receive the generated images in return. Right now, all my models are stored on a network volume. However, I have read that loading models from a network volume is not optimal.
Each workflow uses either Stable Diffusion 1.5 or Stable Diffusion XL. My 1.5 workflows always share some models (such as the checkpoint), and the same goes for SDXL, but otherwise every request needs different models.
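For reference, this is roughly how I submit a workflow and read back the result (simplified sketch; the endpoint ID, API key, and the exact input/output schema are placeholders that depend on the worker handler):

```python
import base64
import json
import requests

# Placeholders -- fill in your own endpoint ID and API key.
ENDPOINT_ID = "your-endpoint-id"
API_KEY = "your-runpod-api-key"

# ComfyUI workflow exported in API format.
with open("workflow_api.json") as f:
    workflow = json.load(f)

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"workflow": workflow}},  # payload shape is handler-specific
    timeout=600,
)
resp.raise_for_status()
output = resp.json()["output"]

# My handler returns images as base64; decode and save them.
for i, img in enumerate(output.get("images", [])):
    with open(f"result_{i}.png", "wb") as out:
        out.write(base64.b64decode(img["data"]))
```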
I am thinking about the following options to optimize further:
1. bake almost all the models, except the LoRAs, into one Docker image (about 30 GB)
2. build two images, one with all the SDXL models and one with all the 1.5 models
3. build two images, one for 1.5 and one for SDXL, but include in each image only the models that every request needs (such as the 1.5 or SDXL checkpoint) and keep the rest on the network volume (see the sketch after this list)
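For option 3, I imagine doing something like this when the container starts (rough sketch; the ComfyUI install path is an assumption, while /runpod-volume is where RunPod mounts network volumes on serverless workers):

```python
import os
from pathlib import Path

# Checkpoints are baked into the image under the ComfyUI models folder,
# while LoRAs etc. stay on the network volume.
BAKED_MODELS = Path("/comfyui/models")          # assumed install path inside the image
VOLUME_MODELS = Path("/runpod-volume/models")   # network volume mount on serverless

def link_volume_models(subdirs=("loras", "embeddings")):
    """Symlink volume-hosted model folders into ComfyUI's models directory.

    Skips a folder if it already exists inside the image.
    """
    for sub in subdirs:
        src = VOLUME_MODELS / sub
        dst = BAKED_MODELS / sub
        if src.is_dir() and not dst.exists():
            os.symlink(src, dst)

if __name__ == "__main__":
    link_volume_models()
```

Alternatively, ComfyUI's extra_model_paths.yaml could point at the volume directories instead of symlinking.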
Does anyone have an idea which approach would be best?
Thanks!
3 Replies
#2 is your best shot.
Actually, I was reading some tests that RunPod ran internally, and the load time from the network volume to the worker seems to be under 4-6 seconds, which I think is acceptable, but it depends on how large that is compared to your actual job execution and response time (quick timing sketch below).
But the reason you want to avoid network storage is so you aren't locked to one region in case availability in that region drops.
1) Your image might get too big, and if it does, option 2 is the better way to go
2) Seems reasonable and would result in smaller final images
3) Any network storage will tie you to the region you're in
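If you want to check whether the volume is actually the bottleneck for your jobs, a rough timing sketch like this inside the worker (paths are placeholders) shows how the load time compares to the rest of the job:

```python
import time
from pathlib import Path

def time_read(path: str, chunk_size: int = 1 << 20) -> float:
    """Read the file once and return the elapsed seconds (approximates model load I/O)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(chunk_size):
            pass
    return time.perf_counter() - start

# Placeholder paths: same checkpoint on the network volume vs. baked into the image.
volume_ckpt = "/runpod-volume/models/checkpoints/sd_xl_base_1.0.safetensors"
local_ckpt = "/comfyui/models/checkpoints/sd_xl_base_1.0.safetensors"

for label, path in [("network volume", volume_ckpt), ("baked into image", local_ckpt)]:
    if Path(path).exists():
        size_gb = Path(path).stat().st_size / 1e9
        print(f"{label}: {size_gb:.1f} GB read in {time_read(path):.1f} s")
```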
ok thank you!
It would be nice if we could duplicate our network volumes across different regions to fix the bottleneck
This is on their upcoming roadmap!
#📌|roadmap
I know the team is working hard on object storage and also multi-region volumes.