RunPod•6mo ago

Active workers or Flex workers? - Stable Diffusion

I'm integrating Stable Diffusion into a mobile application where user prompts are sent to RunPod for image generation, with the results sent back to the app. The usage is highly variable, ranging from 15 to 100 image generations per day, and there may be days with no usage at all. Given this variability, should I opt for active workers or flex workers in RunPod for the most efficient scaling and cost management? And in my case, what are flex workers and active workers each suitable for?

11 Replies

yhlong00000•6mo ago

Active workers help reduce cold start time but come with a cost, even with a 30% discount when idle. Based on your traffic of 100 images/day, I wouldn’t recommend using active workers unless you need super-fast responses. You can configure 3-5 max workers to handle traffic surges for scaling. By the way, where did you see ‘flex worker’? It’s an old term, and we should consider removing it to avoid confusion.😂
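To make the trade-off in this reply concrete, here is a rough back-of-the-envelope sketch. Every number in it is a placeholder rather than real RunPod pricing: the per-second price, the 10-second generation time, and the 5-second idle timeout are assumptions for illustration, and the 30% discount is simply the figure quoted above. Cold-start time is ignored.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_cost_active(price_per_sec: float, discount: float = 0.30) -> float:
    """Cost of keeping one active worker up all day at a discounted rate."""
    return SECONDS_PER_DAY * price_per_sec * (1 - discount)


def daily_cost_on_demand(jobs: int, secs_per_job: float,
                         idle_timeout: float, price_per_sec: float) -> float:
    """Upper-bound cost when workers only run per job.

    Assumes every job also pays for one full idle timeout, which
    overestimates the bill when requests arrive back to back.
    """
    return jobs * (secs_per_job + idle_timeout) * price_per_sec


# Hypothetical inputs: replace with the real GPU price and measured timings.
PRICE = 0.0004  # dollars per second, placeholder value
active = daily_cost_active(PRICE)
on_demand = daily_cost_on_demand(jobs=100, secs_per_job=10,
                                 idle_timeout=5, price_per_sec=PRICE)
print(f"active worker: ${active:.2f}/day, on-demand: ${on_demand:.2f}/day")
```

Under these assumed numbers the always-on worker costs far more than paying per job, which matches the advice above for traffic of around 100 images per day.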

nerdylive•6mo ago

probably here: https://www.runpod.io/serverless-gpu

Serverless GPU Endpoints for AI Inference

Run machine learning inference at scale with RunPod Serverless GPU endpoints.

nerdylive•6mo ago

What is flex called now?

yhlong00000•6mo ago

😂😂 It’s my lack of history knowledge 🥲. Flex worker pretty much means every type of worker other than an active worker.

nerdylive•6mo ago

yeah, that's what I understand too

GABOP•6mo ago

Ahhhh so it's active worker, vs non-active. In my case, would the non-active worker scale to 0, upon zero traffic?

yhlong00000•6mo ago

Check this doc and let me know if you have more questions: https://docs.runpod.io/serverless/references/endpoint-configurations#active-min-workers

Endpoint configurations | RunPod Documentation

Configure your Endpoint settings to optimize performance and cost, including GPU selection, worker count, idle timeout, and advanced options like data centers, network volumes, and scaling strategies.

GABOP•6mo ago

I'm having trouble understanding the documentation.

[Idle Timeout] "The amount of time in seconds a worker not currently processing a job will remain active until it is put back into standby. During the idle period, your worker is considered running and will incur a charge."

Does that mean there is no option for something like "charge per generation"? I was initially interested in the fact that this obsolete "flex worker" runs only when required, so if no computation is used, there are zero charges.

"You will incur the cost of any active workers you have set regardless if they are working on a job."

Does that mean there is no such thing as, let's say, an API call from my app to RunPod, which then uses a worker to generate an image, after which the worker shuts down?
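For reference, the "API call from my app" part of this flow might look roughly like the sketch below. It only builds the HTTP request and does not send it. The URL follows the `/v2/{endpoint_id}/runsync` pattern of RunPod's serverless API as I understand it, so check it against the current docs; the endpoint ID, API key, and the `prompt` field inside `input` are placeholders, since the input schema depends on the worker's own handler.

```python
import json
import urllib.request


def build_runsync_request(endpoint_id: str, api_key: str,
                          prompt: str) -> urllib.request.Request:
    """Build (but do not send) a synchronous job request for an endpoint."""
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    # The shape of "input" is defined by the worker's handler;
    # a single "prompt" field is an assumption for illustration.
    payload = {"input": {"prompt": prompt}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_runsync_request("YOUR_ENDPOINT_ID", "YOUR_API_KEY",
                            "a lighthouse at dusk")
print(req.get_method(), req.full_url)
# Passing req to urllib.request.urlopen would actually submit the job.
```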

nerdylive•6mo ago

Idle timeout = "idle active": it keeps the worker active for x seconds after a job so the model stays warm for faster starts. And yes, just set it to 1 second then; the worker will go off after 1 second if you have no other active workers.

yhlong00000•6mo ago

A serverless worker runs per job, and once it’s done, it shuts off to save costs. The idle timeout keeps the worker running for a few extra seconds in case another request comes in right after, so you can avoid a cold start.
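The behaviour described here can be sketched as a small simulation. This is a simplified model, not RunPod's actual billing logic: it assumes a single worker, no cold-start time, a fixed job duration, and that the worker is billed for job time plus any idle time until the timeout expires. The function and parameter names are made up for illustration.

```python
def billed_seconds(arrivals, job_secs, idle_timeout):
    """Estimate billed worker seconds for a list of request arrival times."""
    total = 0.0
    free_at = None  # time at which the worker finished its latest job
    for t in sorted(arrivals):
        if free_at is None:
            start = t  # first request: worker starts fresh
        elif t <= free_at + idle_timeout:
            # Worker is still warm (or busy): bill the idle gap, if any.
            start = max(t, free_at)
            total += start - free_at
        else:
            # Worker sat idle for the full timeout, then shut down.
            total += idle_timeout
            start = t
        total += job_secs
        free_at = start + job_secs
    if free_at is not None:
        total += idle_timeout  # idle window after the last job
    return total


# Three requests at t=0s, 12s and 100s, each taking 10 seconds to run.
print(billed_seconds([0, 12, 100], job_secs=10, idle_timeout=5))
print(billed_seconds([0, 12, 100], job_secs=10, idle_timeout=1))
```

With the longer timeout the second request lands on a warm worker and only the 2-second gap is billed; with the 1-second timeout the worker has already shut down, so that request would pay a cold start instead.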

GABOP•6mo ago

Ahhh that's what I was hoping as an answer. Thank you! @nerdylive also helped explain about the timeouts, that will be amazing for my task. Thank you!
