My serverless endpoint does not deploy new releases
I was using version 1.2.4 of my image on an endpoint. I started a new release (1.2.5), but the workers still run 1.2.4. To be sure, I released and deployed version 1.2.6 as well. I waited 3-4 hours between these deployments, but the endpoint still uses 1.2.4. What could be the problem?
Thanks.
Do you see Latest and Stale workers on your endpoint page when doing a new release?
Yes, I have 3 stale workers and 1 extra worker.
When I start a new release, it says initializing, but it still uses the same image.
Refresh your browser?
Who knows, maybe it's just not updating in your browser
Nah, it's not a browser issue. He said it's still using the old image version, so it must be some serverless bug. I would log a support issue for this on the website.
There should be "Latest Workers" with the new image and there aren't any, so something is broken.
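To make the "no Latest Workers" check concrete, here is a minimal sketch. It assumes you have copied the image reference of each worker by hand from the endpoint page; `image_tag` and `summarize_rollout` are hypothetical helper names, not part of any RunPod tooling.

```python
from collections import Counter


def image_tag(image_ref: str) -> str:
    """Return the tag of an image reference, or "latest" when none is given.

    Digest references (name@sha256:...) are not handled in this sketch.
    """
    last_part = image_ref.rsplit("/", 1)[-1]  # skip any registry host:port prefix
    _name, sep, tag = last_part.partition(":")
    return tag if sep else "latest"


def summarize_rollout(worker_images, expected_tag):
    """Count workers on the expected tag (latest) versus any other tag (stale)."""
    tags = Counter(image_tag(ref) for ref in worker_images)
    latest = tags[expected_tag]
    stale = sum(tags.values()) - latest
    return {"latest": latest, "stale": stale, "rolled_out": latest > 0}


# Hypothetical data matching the thread: 3 stale workers plus 1 extra, all on 1.2.4
workers = ["myuser/myimage:1.2.4"] * 4
print(summarize_rollout(workers, "1.2.6"))
# {'latest': 0, 'stale': 4, 'rolled_out': False}
```

If `rolled_out` stays `False` long after a release, the release has not reached any worker, which matches what is described above.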
Yes, I agree with you. I deployed 1.2.5 yesterday. However, I still have 1.2.4 on the workers.
hmm try creating a new endpoint then?
btw, which region is it and what gpu model did you select?
mine works, so i don't think it's a serverless bug
Maybe low availability of the required GPU type?
yeah i guess that
I have found that with :latest it will not update. If you instead specify a specific version, e.g. :1.01, then changing it to :1.02 (or anything else) makes the update work. You cannot use :latest and assume it will update. It will not. You have to edit the template and provide a tag RunPod has never seen.
I don't use the latest tag.
I use the cheapest worker option, the 16 GB GPU, and it says high availability. Still, the fact that the release was not deployed even after 1 day seems like a problem.
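The "always use a tag that has never been seen" advice can be sketched as a small pre-release check. This is only an illustration under stated assumptions: tags follow an x.y.z scheme like the 1.2.4 / 1.2.5 versions in this thread, and `next_patch_tag`, `check_unseen_tag`, and the `deployed_before` set are hypothetical names you would maintain yourself.

```python
import re


def next_patch_tag(image_ref: str) -> str:
    """Return image_ref with the patch number of its x.y.z tag incremented."""
    name, sep, tag = image_ref.rpartition(":")
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", tag)
    if not sep or match is None:
        raise ValueError(f"expected an explicit x.y.z tag, got {image_ref!r}")
    major, minor, patch = match.groups()
    return f"{name}:{major}.{minor}.{int(patch) + 1}"


def check_unseen_tag(image_ref: str, deployed_before: set) -> None:
    """Reject :latest and any image reference that was already deployed."""
    tag = image_ref.rpartition(":")[2]
    if tag == "latest":
        raise ValueError("use an explicit version tag instead of :latest")
    if image_ref in deployed_before:
        raise ValueError(f"{image_ref!r} was already deployed; bump the tag")


# Hypothetical usage: record what was deployed, then derive the next reference
deployed_before = {"myuser/myimage:1.2.4"}
candidate = next_patch_tag("myuser/myimage:1.2.4")
check_unseen_tag(candidate, deployed_before)
print(candidate)  # myuser/myimage:1.2.5
```

The new reference still has to be built, pushed, and entered in the template by whatever process you already use; the sketch only guards against reusing a tag.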
You updated your template with a new tag? Like this?
I could create a new endpoint right now because my app is still in development. However, if the same issue occurs in production, it would be a problem. That's why I'm trying to see if there is something I'm missing.
why is the extra worker still initializing anyway
I don't know..
well check the logs
well, no one has been experiencing this as far as i know
except because of this
It says "Waiting for logs" for the extra worker
Have you tested your image locally?
Nope, I don't have a GPU environment locally, but there is no dramatic change between 1.2.4 and 1.2.5
hmm
yeah worth reporting this to runpod i guess
try using the contact
I reported it, thank you. I will update here if there is still a problem.
Support suggested that I edit the endpoint to set the worker count to 0, then increase it again. That fixed the issue. Thank you!
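The workaround from support (drop the worker count to 0, then restore it) can be sketched as a small routine. This is not RunPod's API: `set_max_workers` is a hypothetical callback that you would implement with whatever you use to edit the endpoint, and the pause length is a guess rather than a documented value.

```python
import time
from typing import Callable


def bounce_max_workers(
    set_max_workers: Callable[[int], None],
    original_max: int,
    pause_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Set max workers to 0, wait, then restore the original value."""
    if original_max <= 0:
        raise ValueError("original_max must be positive to restore the endpoint")
    set_max_workers(0)  # scale down so the old workers are released
    try:
        sleep(pause_seconds)  # assumed grace period, tune as needed
    finally:
        # Always restore the original value, even if the wait is interrupted
        set_max_workers(original_max)


# Hypothetical usage: record the calls instead of touching a real endpoint
calls = []
bounce_max_workers(calls.append, original_max=3, pause_seconds=0)
print(calls)  # [0, 3]
```

Note that the endpoint cannot serve requests while the count is 0, so this is better run outside peak traffic.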
You should not have to do that though
Yes, I think an issue occurred, but I just got rid of the problem
same way of deleting and recreating then
That's a pain, it's easier just to set max workers to zero and back up again
not really for me if once, but not my point there...
What is your point? It's stupid to waste unnecessary time recreating the endpoint, getting a different endpoint ID, having to change the code that calls the endpoint to use the new ID, etc. It's a stupid point, to be honest.
what point?
Both will cause some downtime anyway. So I think RunPod is aware of the problem, since they suggested resetting it like this.
I had this issue occasionally. Just change the maximum or minimum number of workers, or any other param on the endpoint. It will trigger a new deployment.
Like changing it from 3 to 5 works?
Yes