How to use LoRAs in SDXL serverless?
I don't see any docs regarding adding LoRAs in the workers for SDXL. I am assuming this is the worker that I should be using.
https://github.com/runpod-workers/worker-sdxl
Maybe you can use a ComfyUI worker, so you can just use a LoRA ComfyUI workflow
https://github.com/blib-la/runpod-worker-comfy
I use this ^ it's great, works in serverless. But you need to learn a bit about ComfyUI
Thanks a lot, you might have saved me a lot of time!
Solution
Yeah, the runpod SDXL worker doesn't support LoRA
maybe you need to add custom code there to load LoRAs
Yeah, not sure about the syntax or anything unfortunately.
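For what it's worth, loading a LoRA with diffusers looks roughly like this (just a sketch, not the worker's actual handler code; the model ID, LoRA path and prompt are placeholders):
import torch
from diffusers import StableDiffusionXLPipeline

# Sketch only: the real worker wraps this differently; paths here are placeholders
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Load a LoRA that was baked into the image (e.g. via ADD in the Dockerfile)
pipe.load_lora_weights("/models/loras/sdxl.safetensors")

image = pipe(prompt="an astronaut riding a horse", num_inference_steps=30).images[0]
image.save("output.png")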
@briefPeach Hey Brief, sorry about tagging you. I am wondering about the Dockerfile code, any idea how to add your own models and LoRAs?
ADD models/checkpoints/sdxl.safetensors models/checkpoints/
ADD models/loras/sdxl.safetensors models/loras/
I swapped out the original lines with this code, which points at my own model and LoRA, any idea if that's right?
Yeah! That's unfortunate, trying to work out another way using the ComfyUI worker that Brief recommended.
There is also this one that I use for my production applications - https://github.com/ashleykleynhans/runpod-worker-comfyui
It uses network storage so you mount your network storage on a pod and then install all of your custom nodes
Then you use a normal pod to create your workflows and send them to the endpoint.
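Calling the endpoint is just the normal RunPod serverless API, something like this (sketch with plain requests; the endpoint ID, API key and the exact input payload shape are placeholders, since the payload depends on the worker you use):
import json
import requests

# Placeholders: use your own endpoint ID and API key
ENDPOINT_ID = "your-endpoint-id"
API_KEY = "your-runpod-api-key"

# Export your workflow from ComfyUI in API format and load it here
with open("workflow_api.json") as f:
    workflow = json.load(f)

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    # The field name for the workflow depends on the worker, so check its README
    json={"input": {"workflow": workflow}},
    timeout=600,
)
print(resp.json())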
Thanks a lot! I'll look into this right now, looks promising!
I am illiterate with ComfyUI, I am assuming custom nodes are the same as throwing in your custom models and LoRAs?
No, you use custom nodes to add nodes to your workflows to achieve different results.
Gotcha, thanks a bunch!
Hi, to add/download a LoRA, you need to do this in the Dockerfile:
RUN wget -O models/loras/xl_more_art-full_v1.safetensors https://civitai.com/api/download/models/152309
OR, if you want to add your LoRA file from your own computer without downloading it via wget, you need to:
ADD relative/path/to/sdxl.safetensors /comfyui/models/loras/sdxl.safetensors
Make sure this file relative/path/to/sdxl.safetensors exists inside the runpod-worker-comfyui folder you pulled.
I think the preferred way is the first method (RUN wget)
@digigoblin I was also checking out this network volume method. I wonder how the launch and inference speed feels? I assume it's a bit slower than loading everything from container disk without a network volume, since the network volume isn't physically attached to your GPU machine.
wget is not the preferred way, it's actually better to use COPY/ADD so that you don't need to download the model every single time you build your Docker image
Also, have you used a network volume at scale? For example, if I have 5 serverless workers pointing to the same network volume, will the file I/O speed be OK?
yep, a bit of lag
Yes, network volume disk is slower
interesting, how much slower does it feel? (I know it's hard to quantify) but I want to get a rough feeling, like 10%, 50%? Right now I'm using container disk, and it's 2 - 5 seconds of inference for the default ComfyUI text-to-image workflow (cheapest 16GB GPU, A4000, inference only, assuming the server is already booted up and the model is already downloaded)
I wonder how much inference time I should expect if using a network volume
maybe 10% lol, only a bit from what I can feel
got it! then it's not too bad, acceptable I think. thank you! It's very useful information
oh yea this is a valid point. If you frequently rebuild your image, you should use ADD/COPY
I often build it multiple times while testing
you should benchmark it yourself too to see if it's too much
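Something as simple as this is enough for a rough number (sketch; same placeholder endpoint/key as above, and the payload is whatever your worker expects):
import time
import requests

ENDPOINT_ID = "your-endpoint-id"   # placeholder
API_KEY = "your-runpod-api-key"    # placeholder
payload = {"input": {"prompt": "a cat"}}  # placeholder: whatever your worker expects

# Time a few runsync calls to get a rough latency feel
for i in range(5):
    start = time.time()
    requests.post(
        f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json=payload,
        timeout=600,
    )
    print(f"run {i}: {time.time() - start:.1f}s")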
but for reusing data across multiple pods it's great, I think
Yeah I'm about to try to run that
but I worry that multiple pods reading/writing to the same network volume in parallel would make the I/O even slower 😂
I think the connection capacity is great, so don't worry about that too much
thank you, I'll give it a try!
The SDXL worker is based on the diffusers format of SDXL
Ok, I finally tried it and have some benchmarks using a network volume in serverless
using the default ComfyUI text-to-image workflow, SD 1.5, model already downloaded and saved in the network volume's comfyui/models folder
the first run is the slowest, around 15 sec of pure inference + 2s uploading to the S3 bucket (17s in total, as you can see in the screenshot)
the subsequent runs are quicker, 2 - 3 seconds in total (without uploading to S3, it's around 1 - 2 sec)
the reason the 1st run is slow is that it has to load the model into VRAM, I guess?
I'm using the cheapest 16GB GPU. So I think the speed looks OK!
Nice