Confusion about idle timeout
I have a serverless endpoint deployed with an idle timeout of 5 seconds. I expected that if I send a new request more than 5 seconds after the previous one, the Docker image would be downloaded again. Instead, the image seems to stay around much longer: even when I send another request minutes later, it is already loaded, so the response is very quick. That is good, but I don't understand why.
11 Replies
No, the Docker image stays cached
unless you create a new release
When you don't have any requests, it stays
Yes, we cache the image for serverless to ensure a quick response for users. However, if a large number of requests arrive suddenly, we might need to locate an available host and download the image.
I have a follow-up to this question.
Does that mean that if I have a serverless endpoint, I can leave it for days or months without being charged?
Does it not eventually stop being 'ready'? And if it does, how long does it then take to locate an available host and download the image again?
Please tag me if a reply to this comes in.
We will reclaim resources if your endpoint isn’t used for a while. This means the images on the host will be cleaned up, and when you use the endpoint again after some time, you’ll notice that the image will need to be downloaded again.
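To summarise the replies so far, here is a toy model of the three situations described: worker still running, worker stopped but image cached on the host, and image cleaned up. It is only a sketch, not Runpod's actual implementation. The class `ToyEndpoint` and the parameter `image_reclaim_after_s` are made up for illustration, and the reclaim threshold used below is a placeholder, since no real value is given in this thread.

```python
from dataclasses import dataclass


@dataclass
class ToyEndpoint:
    """Toy model of a serverless endpoint; not Runpod's real logic."""

    idle_timeout_s: float         # the idle timeout configured on the endpoint
    image_reclaim_after_s: float  # HYPOTHETICAL: the real threshold is not stated here

    def state_after_idle(self, idle_s: float) -> str:
        """Describe the endpoint after `idle_s` seconds without any request."""
        if idle_s < self.idle_timeout_s:
            # Worker has not hit the idle timeout yet, so it is still up.
            return "worker running"
        if idle_s < self.image_reclaim_after_s:
            # Worker was stopped, but the image is still cached on the host.
            return "worker stopped, image cached"
        # Host resources were reclaimed; the image has to be downloaded again.
        return "worker stopped, image reclaimed"


# 5 s idle timeout as in the question; one day is a placeholder reclaim threshold.
endpoint = ToyEndpoint(idle_timeout_s=5, image_reclaim_after_s=24 * 3600)

for idle in (2, 120, 3 * 24 * 3600):
    print(f"{idle:>7} s idle -> {endpoint.state_after_idle(idle)}")
```

In this model the idle timeout only decides when the worker stops. Whether the next request is fast depends on the separate image cache, which is why a request sent minutes later can still get a quick response.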
How long is a while?
Like minutes, hours days?
I don't need an exact time, but a ballpark estimate would be useful! 😛
I guess it stays long enough if your app is actively used
Like with 1 daily user
At least
Hello guys, I hope you are all doing well. Could you please help me with a question? I am new to Runpod. For the last few days I have been trying to deploy an LLM from Hugging Face on Runpod, but every time, shortly after deployment, Runpod automatically removes the container. Could anyone tell me what I am doing wrong?
Hey, did you use spot? If so, please use on-demand.
Remove container like what?
I just went to Runpod serverless and tried to use microsoft/Phi-3.5-mini-instruct from Hugging Face, but shortly after it says it's ready, it stops, as in the example below:
2024-09-26T06:52:26Z Status: Image is up to date for runpod/worker-v1-vllm:v1.4.0stable-cuda12.1.0
2024-09-26T06:52:26Z worker is ready
2024-09-26T06:52:26Z start container
2024-09-26T06:55:35Z stop container
2024-09-26T06:55:37Z remove container
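For reference, the time the container stayed up in the log above can be worked out with a few lines of Python. This is a minimal sketch that assumes each log line is a UTC timestamp ending in `Z`, then a single space, then the message. The helper name `parse_events` is made up for illustration.

```python
from datetime import datetime

# The relevant lines, copied from the log above.
LOG = """\
2024-09-26T06:52:26Z start container
2024-09-26T06:55:35Z stop container
2024-09-26T06:55:37Z remove container
"""


def parse_events(log: str) -> dict:
    """Map each log message to its timestamp (assumes 'TIMESTAMP message' lines)."""
    events = {}
    for line in log.splitlines():
        if not line.strip():
            continue  # skip blank lines
        stamp, message = line.split(" ", 1)
        events[message] = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return events


events = parse_events(LOG)
uptime = events["stop container"] - events["start container"]
print(f"container was up for {uptime.total_seconds():.0f} s")  # 189 s
```

So the worker ran for a little over three minutes before it was stopped, and the container was removed two seconds later.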
What's the worker status?
I think that's normal
If there are no requests, serverless stops the worker to save you money. If you want to keep it running, set an active worker.