rafael21@•12mo ago

Cold start time

Does anyone know the cold start time of a model hosted on RunPod serverless? The Kandinsky model. 🙂
flash-singhβ€’12mo ago
That is from last week:
[image attachment]
rafael21@OPβ€’12mo ago
Thanks, man. Are you using the RunPod Kandinsky endpoint, or hosting it yourself on serverless?
flash-singhβ€’12mo ago
That is the RunPod endpoint.
rafael21@OPβ€’12mo ago
but is it Kandinsky 3.0?
flash-singhβ€’12mo ago
v2
rafael21@OPβ€’12mo ago
Hmmm... I need 3.0.
flash-singhβ€’12mo ago
we don't have v3
rafael21@OPβ€’12mo ago
Will the cold start time be higher if I host v3 myself on serverless?
flash-singhβ€’12mo ago
It's about the same; it depends on the workload.
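For context, self-hosting on serverless means writing a small worker where the model load happens once per worker, during cold start, rather than per request. A minimal sketch, assuming the `runpod` SDK and the Kandinsky 3 weights via diffusers (the model id and payload shape are illustrative, not from this thread):
```python
# Minimal RunPod serverless worker sketch. Assumes the `runpod`,
# `torch`, and `diffusers` packages are installed in the image.
import runpod
import torch
from diffusers import AutoPipelineForText2Image

# Loading the pipeline at import time means the heavy model load
# happens during cold start, once per worker, not once per request.
pipe = AutoPipelineForText2Image.from_pretrained(
    "kandinsky-community/kandinsky-3",  # assumed model id
    torch_dtype=torch.float16,
).to("cuda")

def handler(event):
    # event["input"] carries the JSON payload sent to the endpoint.
    prompt = event["input"]["prompt"]
    image = pipe(prompt).images[0]
    path = "/tmp/out.png"
    image.save(path)
    return {"image_path": path}

runpod.serverless.start({"handler": handler})
```
With this layout, a request that lands on a warm worker skips straight to the `handler` call; only a scaled-from-zero request pays the import-time load.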
rafael21@OPβ€’12mo ago
Does workload mean the number of requests? My web app would get just a few requests per day... like 5 or 10. Could the cold start time get higher than 3-4 seconds?
flash-singhβ€’12mo ago
For a true cold start, yes; you would have to measure the worst case for v3 yourself. For v2 it seems to be around 12s.
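One way to measure that worst case: let the endpoint scale to zero, then time a single synchronous request end to end. A sketch using the standard RunPod `/runsync` HTTP API; the endpoint id and prompt are placeholders:
```python
# Rough cold-start measurement: with zero warm workers, time one
# synchronous request against the endpoint.
import os
import time
import requests

ENDPOINT_ID = "your-endpoint-id"  # hypothetical placeholder
API_KEY = os.environ["RUNPOD_API_KEY"]

start = time.monotonic()
resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"prompt": "a lighthouse at dawn"}},
    timeout=300,
)
elapsed = time.monotonic() - start
print(f"status={resp.status_code} round trip={elapsed:.1f}s")
# Send a second request immediately after and subtract its (warm)
# latency to estimate the cold-start portion of the first call.
```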
rafael21@OPβ€’12mo ago
😧 12 seconds is a lot
justinβ€’12mo ago
For something like this, curious, why not just ChatGPT? But if you know a worker is potentially about to be pinged, you could try to pre-start it with a warm-up request and set the idle time to 2 mins: e.g., if you see someone typing or opening a chat on your web app, send a pre-warm request so the worker is on, and set it to idle for 3 mins for potential incoming requests (see the sketch below).

Personally though, for the thing you're talking about, I use RunPod for heavier transcription / image gen etc., but if you need a fast LLM response, I found ChatGPT is still the best at low-volume workloads before it makes sense to host your own.
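The pre-warm idea could look roughly like this, assuming a placeholder endpoint id and a `{"warmup": true}` input convention that your handler would need to short-circuit on (both are assumptions, not part of any RunPod API):
```python
# Sketch of the pre-warm trick: fire a cheap no-op request when a user
# opens a chat, so a worker spins up before the real request arrives.
import os
import requests

ENDPOINT_ID = "your-endpoint-id"  # hypothetical placeholder
API_KEY = os.environ["RUNPOD_API_KEY"]

def prewarm():
    # /run is asynchronous: it returns a job id immediately while the
    # worker boots in the background.
    requests.post(
        f"https://api.runpod.ai/v2/{ENDPOINT_ID}/run",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"input": {"warmup": True}},  # handler must recognize this
        timeout=10,
    )

# Call prewarm() from the "user opened a chat" event in your web app;
# the endpoint's idle timeout (set in its config) then keeps the
# worker warm for the follow-up request.
```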