RunPod•9mo ago

Serverless worker loading with stable diffusion pipeline

Hello, I am trying to create a serverless endpoint with a Stable Diffusion pipeline from the Diffusers library. I used the https://github.com/runpod-workers/worker-sdxl repository as a template to cache the model in the Docker image, so that it never has to be re-downloaded from Hugging Face once the image is built.

However, whenever a request is sent to a newly initialized idle worker (i.e. one that hasn't processed any requests yet), it can take up to a minute for the pipeline to load, even though the model is cached. In case it helps clarify what I mean, here are the last few lines of the logs while the pipeline is loading:

97%|█████████████████████████████████████▉ | 865M/890M [00:13<00:00, 82.9MiB/s]
98%|██████████████████████████████████████▎| 873M/890M [00:13<00:00, 82.1MiB/s]
99%|██████████████████████████████████████▌| 881M/890M [00:13<00:00, 82.9MiB/s]
100%|██████████████████████████████████████▉| 889M/890M [00:13<00:00, 81.6MiB/s]

Once a worker has gone through this initial loading on its first request, all subsequent requests are fine and the delay time is very short. But it often happens that a request is sent to a newly initialized idle worker, and in that case it takes far too long to generate a single image: usually 1-2 minutes of loading, when the actual generation time for an image is about 5 seconds.

Is there some way to prevent this loading from happening when a worker receives its first request?
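One pattern that may be relevant here is to load the pipeline once at module import (when the worker process starts) instead of inside the request handler, and to time the load separately so the logs show where the delay actually comes from. The following is only a minimal sketch under stated assumptions: `load_pipeline` is a hypothetical stand-in that simulates a slow load with a short sleep, and `PIPELINE` and `LOAD_SECONDS` are illustrative names, not part of worker-sdxl. In a real worker the stand-in would be replaced by the Diffusers `from_pretrained` call (passing `local_files_only=True` makes it fail fast if something is missing from the cache rather than silently downloading it), and `handler` would be passed to `runpod.serverless.start`.

```python
import time
from typing import Any, Callable, Dict


def load_pipeline() -> Callable[[str], str]:
    """Hypothetical stand-in for the expensive pipeline load.

    Assumption: in a real worker this is where the Diffusers
    from_pretrained(...) call and the move to the GPU would go.
    """
    time.sleep(0.2)  # simulate a slow model load
    return lambda prompt: f"image for: {prompt}"


# Runs once at import time, i.e. when the worker process starts,
# not when the first request arrives.
_load_start = time.perf_counter()
PIPELINE = load_pipeline()
LOAD_SECONDS = time.perf_counter() - _load_start


def handler(job: Dict[str, Any]) -> Dict[str, Any]:
    """Per-request handler that reuses the already-loaded PIPELINE."""
    start = time.perf_counter()
    image = PIPELINE(job["input"]["prompt"])
    return {"image": image, "handler_seconds": time.perf_counter() - start}


if __name__ == "__main__":
    # In a real RunPod worker, this is where
    # runpod.serverless.start({"handler": handler}) would be called.
    print(f"pipeline load took {LOAD_SECONDS:.2f}s")
    print(handler({"input": {"prompt": "a red bicycle"}}))
```

Logging the load time separately from the handler time makes it easier to tell whether the delay is a one-time startup cost of the worker or something that happens inside the first request. If the progress bars still appear with `local_files_only=True` removed but an error appears with it set, that would suggest some file is not actually in the image cache.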


