RunPod8mo ago
K/S

How to use LoRAs in SDXL serverless?

I don't see any docs regarding adding LoRAs in the workers for SDXL. I'm assuming this is the worker I should be using: https://github.com/runpod-workers/worker-sdxl
Solution:
Yeah, the runpod SDXL worker doesn't support LoRA
28 Replies
briefPeach8mo ago
Maybe you can use a ComfyUI worker, so you can just use a LoRA ComfyUI workflow.
briefPeach8mo ago
GitHub - blib-la/runpod-worker-comfy: ComfyUI as a serverless API on RunPod
https://github.com/blib-la/runpod-worker-comfy
briefPeach8mo ago
I use this ^ and it's great, works in serverless. But you need to learn a bit about ComfyUI.
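For anyone new to ComfyUI: in its API ("prompt") JSON format, a LoRA is applied by wiring a LoraLoader node between the checkpoint loader and whatever consumes the model and CLIP. A rough sketch of that fragment, written as a Python dict; the node IDs and the LoRA filename are illustrative, and the file has to exist under ComfyUI's models/loras directory in the worker image or network volume:

lora_fragment = {
    "4": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"},
    },
    "10": {
        "class_type": "LoraLoader",
        "inputs": {
            "model": ["4", 0],                   # MODEL output of the checkpoint loader
            "clip": ["4", 1],                    # CLIP output of the checkpoint loader
            "lora_name": "my_lora.safetensors",  # illustrative filename in models/loras
            "strength_model": 0.8,
            "strength_clip": 0.8,
        },
    },
    # Downstream nodes (CLIPTextEncode, KSampler, ...) then take their model/clip
    # inputs from ["10", 0] and ["10", 1] instead of directly from node "4".
}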
K/SOP8mo ago
Thanks a lot, you might have saved me a lot of time!
Solution
digigoblin8mo ago
Yeah, the runpod SDXL worker doesn't support LoRA
nerdylive8mo ago
Maybe you need to add custom code to load LoRAs from there.
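For what it's worth, since the SDXL worker is built on a diffusers pipeline (see Madiator2011's note further down), that custom code would look roughly like the sketch below. This is only an illustration, not the worker's actual handler; the model ID and the LoRA directory/filename are placeholders.

import torch
from diffusers import StableDiffusionXLPipeline

# Base SDXL pipeline; the worker loads something equivalent at startup.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Apply a LoRA baked into the image (directory and filename are placeholders).
pipe.load_lora_weights("/models/loras", weight_name="sdxl_lora.safetensors")

# Generate as usual; the LoRA weights are now applied to the pipeline.
image = pipe("an astronaut riding a horse, detailed", num_inference_steps=30).images[0]
image.save("/tmp/out.png")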
K/SOP8mo ago
Yeah, not sure about the syntax or anything unfortunately.
@briefPeach Hey Brief, sorry about tagging you. I am wondering about the Dockerfile code: any idea how to add your own models and LoRA?
ADD models/checkpoints/sdxl.safetensors models/checkpoints/
ADD models/loras/sdxl.safetensors models/loras/
I swapped out the RUN with this code, pointing at my own model and LoRA. Any idea if that's right?
Yeah! That's unfortunate, trying to work out another way using the ComfyUI worker that Brief recommended.
digigoblin8mo ago
There is also this one that I use for my production applications - https://github.com/ashleykleynhans/runpod-worker-comfyui
digigoblin8mo ago
It uses network storage, so you mount your network storage on a pod and then install all of your custom nodes. Then you use a normal pod to create your workflows and send them to the endpoint.
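Once the endpoint is up, sending a workflow from your own code is a plain HTTP call to RunPod's serverless API. A minimal sketch is below; the endpoint ID is a placeholder, and the exact shape of the "input" object (for example whether the workflow goes under a "workflow" key) depends on which worker you deployed, so check that worker's README.

import os
import requests

ENDPOINT_ID = "your-endpoint-id"        # placeholder
API_KEY = os.environ["RUNPOD_API_KEY"]  # your RunPod API key

# Workflow exported from ComfyUI in API format (empty placeholder here).
workflow = {}

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"workflow": workflow}},  # key name is an assumption; varies by worker
    timeout=600,
)
print(resp.json())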
K/SOP8mo ago
Thanks a lot! I'll look into this right now, looks promising! I'm illiterate with ComfyUI, so I'm assuming custom nodes are the same as throwing in your custom models and LoRAs?
digigoblin8mo ago
No, you use custom nodes to add nodes to your workflows to achieve different results.
K/SOP8mo ago
Gotcha, thanks a bunch!
briefPeach8mo ago
Hi, to add/download a LoRA you need to do this in the Dockerfile:
RUN wget -O models/loras/xl_more_art-full_v1.safetensors https://civitai.com/api/download/models/152309
OR, if you want to add your LoRA file from your own computer without downloading it via wget, you need to:
ADD relative/path/to/sdxl.safetensors /comfyui/models/loras/sdxl.safetensors
Make sure the file relative/path/to/sdxl.safetensors exists inside the runpod-worker-comfyui folder you pulled. I think the preferred way is the first method (RUN wget).
@digigoblin I was also checking out this network volume method. I wonder how the launch and inference speed feels to you? I assume it's a bit slower than loading everything from the container disk without a network volume, since the network volume is not physically attached to your GPU machine.
digigoblin8mo ago
wget is not the preferred way; it's actually better to use COPY/ADD so that you don't need to download the model every single time you build your Docker image.
briefPeach8mo ago
Also, have you used a network volume at scale? For example, if I have 5 serverless workers pointing to the same network volume, will the file I/O speed be OK?
nerdylive8mo ago
Yep, a bit of lag.
digigoblin8mo ago
Yes, network volume disk is slower
briefPeach8mo ago
Interesting, how much slower does it feel? (I know it's hard to quantify.) I just want a rough feel, like 10% or 50%? Right now I'm using the container disk, and it's 2-5 seconds of inference for the default ComfyUI text-to-image workflow (cheapest 16 GB GPU, an A4000; inference only, assuming the server is already booted and the model is already downloaded). I wonder how much inference time I should expect when using a network volume.
nerdylive8mo ago
Maybe 10% lol, only a bit from what I can feel.
briefPeach8mo ago
Got it! Then it's not too bad, acceptable I think. Thank you, that's very useful information.
Oh yeah, this is a valid point. If you frequently rebuild your image, you should use ADD/COPY.
digigoblin8mo ago
I often build it multiple times while testing
nerdylive8mo ago
You should benchmark it yourself too to see if it's too much, but for reusing data across multiple pods I think it's great.
briefPeach8mo ago
Yeah, I'm about to try that, but I worry that multiple pods reading/writing to the same network volume in parallel would make the I/O even slower 😂
nerdylive8mo ago
I think the connection capacity is great, so don't worry about that too much.
briefPeach8mo ago
Thank you, I'll give it a try!
Madiator20118mo ago
The SDXL worker is based on the diffusers format of SDXL.
briefPeach8mo ago
OK, I finally tried it and have some benchmarks using a network volume in serverless, with the default ComfyUI text-to-image workflow, SD 1.5, and the model already downloaded and saved in the network volume's comfyui/models folder.
The first run is the slowest: around 15 s of pure inference + 2 s uploading to the S3 bucket (17 s in total, as you can see in the screenshot).
Subsequent runs are quicker: 2-3 seconds in total (without uploading to S3, it's around 1-2 s).
The reason the first run is slow is that it has to load the model into VRAM, I guess? I'm using the cheapest 16 GB GPU. So I think the speed looks OK!
[Screenshot: benchmark timings]
nerdylive8mo ago
Nice