How to use LoRAs in SDXL serverless?
I don't see any docs regarding adding LoRAs in the workers for SDXL. I am assuming this is the worker that I should be using.
https://github.com/runpod-workers/worker-sdxl
Maybe you can use a ComfyUI worker, so you can just use a LoRA ComfyUI workflow
https://github.com/blib-la/runpod-worker-comfy
I use this ^ it's great, works in serverless. But you need to learn a bit about ComfyUI
Thanks a lot, you might have saved me a lot of time!
Solution
Yeah, the runpod SDXL worker doesn't support LoRA
maybe you need to add custom code there to load LoRAs
Yeah, not sure about the syntax or anything unfortunately.
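For what it's worth, loading a LoRA with diffusers looks roughly like this (just a sketch, not the worker's actual handler code; the model ID, LoRA path and prompt are placeholders):
import torch
from diffusers import StableDiffusionXLPipeline

# Sketch only: the real worker wraps this differently; paths here are placeholders
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Load a LoRA that was baked into the image (e.g. via ADD in the Dockerfile)
pipe.load_lora_weights("/models/loras/sdxl.safetensors")

image = pipe(prompt="an astronaut riding a horse", num_inference_steps=30).images[0]
image.save("output.png")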
@briefPeach Hey Brief, sorry about tagging you. I am wondering about the Dockerfile code, any idea how to add your own models and LoRAs?
ADD models/checkpoints/sdxl.safetensors models/checkpoints/
ADD models/loras/sdxl.safetensors models/loras/
I swapped out the original lines with this code, which points at my own model and LoRA, any idea if that's right?
Yeah! That's unfortunate, trying to work out another way using the ComfyUI worker that Brief recommended.
There is also this one that I use for my production applications - https://github.com/ashleykleynhans/runpod-worker-comfyui
It uses network storage so you mount your network storage on a pod and then install all of your custom nodes
Then you use a normal pod to create your workflows and send them to the endpoint.
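Calling the endpoint is just the normal RunPod serverless API, something like this (sketch with plain requests; the endpoint ID, API key and the exact input payload shape are placeholders, since the payload depends on the worker you use):
import json
import requests

# Placeholders: use your own endpoint ID and API key
ENDPOINT_ID = "your-endpoint-id"
API_KEY = "your-runpod-api-key"

# Export your workflow from ComfyUI in API format and load it here
with open("workflow_api.json") as f:
    workflow = json.load(f)

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    # The field name for the workflow depends on the worker, so check its README
    json={"input": {"workflow": workflow}},
    timeout=600,
)
print(resp.json())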
Thanks a lot! I'll look into this right now, looks promising!
I am illiterate with ComfyUI, I am assuming custom nodes are the same as throwing in your custom models and LoRAs?
No, you use custom nodes to add nodes to your workflows to achieve different results.
Gotcha, thanks a bunch!
Hi, to add/download a LoRA, you need to do this in the Dockerfile:
RUN wget -O models/loras/xl_more_art-full_v1.safetensors https://civitai.com/api/download/models/152309
OR, if you want to add your LoRA file from your own computer without downloading it via wget, you need to:
ADD relative/path/to/sdxl.safetensors /comfyui/models/loras/sdxl.safetensors
Make sure this file relative/path/to/sdxl.safetensors exists inside the runpod-worker-comfyui folder you pulled.
I think the preferred way is the first method (RUN wget)
@digigoblin I was also checking out this network volume method. I wonder how the launch and inference speed feels? I assume it's a bit slower than loading everything from container disk without a network volume, since the network volume isn't physically attached to your GPU machine.
wget is not the preferred way, it's actually better to use COPY/ADD so that you don't need to download the model every single time you build your Docker image
Also, have you used a network volume at scale? For example, if I have 5 serverless workers pointing to the same network volume, will the file I/O speed be OK?
yep, a bit of lag
Yes, network volume disk is slower
interesting, how much slower does it feel? (I know it's hard to quantify) but I want to get a rough feeling, like 10%, 50%? Right now I'm using container disk, and it's 2 - 5 seconds of inference for the default ComfyUI text-to-image workflow (cheapest 16GB GPU, A4000, inference only, assuming the server is already booted up and the model is already downloaded)
I wonder how much inference time I should expect if using a network volume
maybe 10% lol, only a bit from what I can feel
got it! then it's not too bad, acceptable I think. thank you! It's very useful information
oh yea this is a valid point. If you frequently rebuild your image, you should use ADD/COPY
I often build it multiple times while testing
you should benchmark it yourself too to see if it's too much
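Something as simple as this is enough for a rough number (sketch; same placeholder endpoint/key as above, and the payload is whatever your worker expects):
import time
import requests

ENDPOINT_ID = "your-endpoint-id"   # placeholder
API_KEY = "your-runpod-api-key"    # placeholder
payload = {"input": {"prompt": "a cat"}}  # placeholder: whatever your worker expects

# Time a few runsync calls to get a rough latency feel
for i in range(5):
    start = time.time()
    requests.post(
        f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json=payload,
        timeout=600,
    )
    print(f"run {i}: {time.time() - start:.1f}s")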
but for reusing data across multiple pods it's great, I think
Yeah I'm about to try to run that
but I worry that multiple pods reading/writing to the same network volume in parallel would make the I/O even slower 😂
I think the connection capacity is great, so don't worry about that too much
thank you, I'll give it a try!
The SDXL worker is based on the diffusers format of SDXL
Ok, I finally tried it and have some benchmarks using a network volume in serverless
using the default ComfyUI text-to-image workflow, SD 1.5, model already downloaded and saved in the network volume's comfyui/models folder
the first run is the slowest, around 15 sec of pure inference + 2s uploading to the S3 bucket (17s in total, as you can see in the screenshot)
the subsequent runs are quicker, 2 - 3 seconds in total (without uploading to S3, it's around 1 - 2 sec)
the reason the 1st run is slow is that it has to load the model into VRAM, I guess?
I'm using the cheapest 16GB GPU. So I think the speed looks OK!
Nice