RunPod · 2mo ago
anders

Understanding serverless & pricing. Use case: a1111 --api with ControlNet on serverless

Right now I'm running a version of JuggernautXL with ControlNet. Generating one image takes about 20 seconds on an A5000, sometimes less. However, the response time on my endpoints is sometimes very slow. I'm trying to figure out whether this is purely cold start time, or whether my requests are sitting in the queue before landing on a GPU. I guess my questions are:
- How can I tell when a request is waiting in the queue vs. waiting on a cold start, and is queue time billed? How can I control queue time?
- Does anyone have experience reducing cold start time for a1111 serverless requests, or has anyone set up a diffusers pipeline without a1111 for serverless that works really well?
- If you can help me, AWESOME. I'm bad at this, but my project is really cool.
1 Reply
nerdylive · 2mo ago
FlashBoot helps reduce load time on subsequent requests, as long as your requests aren't too rare (no exact time is specified, but if requests come in often, the model will load quickly). In the endpoint settings you can also configure scaling so that additional workers start once jobs have been waiting in the queue for a specified amount of time. You could also try ComfyUI workers.
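On telling queue/cold-start delay apart from generation time: a minimal sketch, assuming the job status JSON from the endpoint carries `delayTime` and `executionTime` fields in milliseconds (check your own responses to confirm the names and units). The helper `split_timing` and the sample payload are hypothetical, made up for illustration. It only splits one request's latency into time before the handler ran, time spent generating, and whatever is left over (network, polling).

```python
from typing import TypedDict


class Timing(TypedDict):
    delay_s: float      # time before the handler started (queue and/or cold start)
    execution_s: float  # time the handler spent running the job
    other_s: float      # remainder of the client-side wall clock


def split_timing(status: dict, wall_clock_s: float) -> Timing:
    """Break one request's total latency into delay, execution, and the rest."""
    # Assumed field names, both taken to be in milliseconds.
    delay_s = status.get("delayTime", 0) / 1000
    execution_s = status.get("executionTime", 0) / 1000
    # Whatever the client measured beyond those two is network/polling overhead.
    other_s = max(wall_clock_s - delay_s - execution_s, 0.0)
    return {"delay_s": delay_s, "execution_s": execution_s, "other_s": other_s}


# Hypothetical status payload, not real output from any endpoint.
example = {"id": "job-123", "status": "COMPLETED",
           "delayTime": 14500, "executionTime": 20250}
timing = split_timing(example, wall_clock_s=36.0)
print(timing)
```

Logging this per request should make the pattern visible: a large delay only on the first request after an idle period points to cold start, while a large delay when all workers are already busy points to queueing.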