anders
RunPod
•Created by anders on 5/31/2024 in #⚡|serverless
All pods unavailable | help needed for a future-proof strategy
Region eu-se-1 has all pods unavailable for serverless.
I need to protect against this because of my SLA. It's hard because I literally don't know how to do that or where to read about it. Starting Monday I expect a need of about 1000-2000 USD a month, so I would love some help.
Maybe I am stupid, but I will have to look for alternatives, and I'm of course a bit stressed. I hope you guys figure it out and/or can help me avoid and monitor this problem in the future.
- Yes, I can set up an endpoint on all clouds, but really I would need to set active workers to avoid this issue, which defeats the purpose of serverless, unless I can predict the future and set active workers before others do, and I don't want to have to program that algorithm.
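To make the "endpoint on all clouds" idea concrete, here is a minimal sketch of client-side failover, assuming one endpoint per region. Everything in it is hypothetical: `NoCapacityError`, `submit_with_failover`, and the per-endpoint `submit` callables are placeholders for whatever client call actually sends the job, not part of any RunPod SDK.

```python
from typing import Any, Callable, List, Tuple

class NoCapacityError(Exception):
    """Hypothetical error: the endpoint has no workers available."""

# Each entry is (label, submit function). The submit functions are
# placeholders for the real call that sends a job to that endpoint.
Endpoint = Tuple[str, Callable[[dict], Any]]

def submit_with_failover(payload: dict, endpoints: List[Endpoint]) -> Tuple[str, Any]:
    """Try endpoints in order and return (label, result) from the first that accepts the job."""
    failures = []
    for label, submit in endpoints:
        try:
            return label, submit(payload)
        except NoCapacityError as exc:
            # Remember why this endpoint was skipped, then try the next one.
            failures.append(f"{label}: {exc}")
    raise NoCapacityError("all endpoints unavailable: " + "; ".join(failures))
```

This only moves requests to wherever capacity exists at the moment. It does not guarantee capacity, so it reduces the risk rather than replacing active workers.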
17 replies
RunPod
•Created by anders on 5/21/2024 in #⚡|serverless
Is there any published information on uptime, or tips on how to think about SLAs?
Basically the title: how should I approach this? Tips? Writings? Blogs? Help a guy out.
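One generic way to think about an uptime target is as a downtime budget. This is plain arithmetic, not a published RunPod figure, and the function name `downtime_budget_minutes` is made up for illustration.

```python
def downtime_budget_minutes(uptime_target: float, period_days: float = 30) -> float:
    """Minutes of downtime allowed per period for a given uptime fraction (e.g. 0.999)."""
    if not 0 <= uptime_target <= 1:
        raise ValueError("uptime_target must be a fraction between 0 and 1")
    total_minutes = period_days * 24 * 60
    return (1 - uptime_target) * total_minutes

# A 99.9% target over 30 days leaves about 43.2 minutes of downtime.
budget = downtime_budget_minutes(0.999)
```

An outage like a whole region having no pods for an hour would already exceed that budget for the month, so the target you promise should reflect what the provider can actually deliver.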
3 replies
RunPod
•Created by anders on 5/9/2024 in #⚡|serverless
Understanding serverless & pricing. Usecase: a1111 --api w. control on serverless
Right now I'm running some version of juggernautxl with controlnet. I spend about 20 seconds generating 1 image, sometimes less, on an a5000. However, sometimes the response time on my endpoints is very slow. I'm trying to figure out whether this is purely because of cold start time, or because my requests sit in the queue before landing on a GPU. I guess my questions are:
- How can I see whether a request is waiting in the queue vs. in cold start, and is queue time billed? How can I control queue time?
- Does anyone have experience reducing cold start time for a1111 serverless requests, and/or has anyone set up some diffuser setup without a1111 for serverless that works really well?
- If you can help me, AWESOME. I'm shit at this, but my project is really cool.
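For the first question, a rough sketch of how the slow responses could be broken down, assuming the job status response carries `delayTime` and `executionTime` fields in milliseconds. That is an assumption to check against the actual response, and `split_latency` is a made-up helper name.

```python
def split_latency(status: dict, wall_clock_ms: float) -> dict:
    """Split a request's total time into waiting, running, and leftover overhead.

    Assumes `status` has `delayTime` and `executionTime` in milliseconds.
    `wall_clock_ms` is the total time measured on the client side.
    """
    waiting = status.get("delayTime", 0)      # time before a worker started the job
    running = status.get("executionTime", 0)  # time the handler spent on the job
    # Whatever is left is network and polling overhead; clamp at zero.
    overhead = max(wall_clock_ms - waiting - running, 0)
    return {"waiting_ms": waiting, "running_ms": running, "overhead_ms": overhead}
```

If `waiting_ms` is large while `running_ms` stays near the usual 20 seconds, the slowness comes from before the job started. These two numbers alone do not separate queue time from worker start-up, so that part would still need worker logs or a timestamp written by the handler itself.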
4 replies