2 active workers on serverless endpoint keep rebooting
We have 2 active workers on a serverless endpoint, sometimes the workers reboot at the same time for some reason, which causes major problems in our system.
```
2024-04-03T14:37:16Z create pod network
2024-04-03T14:37:16Z create container endpoint-image:1.2
2024-04-03T14:37:17Z start container
2024-04-03T15:27:23Z stop container
2024-04-03T15:27:24Z remove container
2024-04-03T15:27:24Z remove network
2024-04-03T15:27:30Z create pod network
2024-04-03T15:27:30Z create container endpoint-image:1.2
2024-04-03T15:27:30Z start container
2024-04-03T17:34:51Z stop container
2024-04-03T17:34:51Z remove container
2024-04-03T17:34:51Z remove network
```
Has anyone ever had this problem? How can we fix it?
RunPod version: 1.3.0
Base Docker image: python:3.11-slim
Our image version: 1.2
8 Replies
Your serverless worker needs a startup command; right now you're just running a plain Python Docker image.
Our Docker image already has a start command. Should I add one anyway in our RunPod template?
Not sure if you're saying you had this API working before and these two workers suddenly started doing this, or if you're trying to deploy serverless for the first time.
If the latter, and you're running into this issue while deploying, then as madiator said, make sure you're specifically calling handler.py, which needs a runpod.serverless.start() call in the file to be triggered.
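The usual shape is something like this (a generic sketch using the standard runpod SDK, not your actual file):

```python
import runpod

def handler(job):
    # job["input"] carries whatever payload was sent to the endpoint
    job_input = job["input"]
    # ... do the actual work here ...
    return {"output": job_input}

# Register the handler and start the serverless worker loop
runpod.serverless.start({"handler": handler})
```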
GitHub: runpodWhisperx/Dockerfile at master · justinwlin/runpodWhisperx
Are you doing that?
https://blog.runpod.io/serverless-create-a-basic-api/
An example of a RunPod blog post walking through the setup.
RunPod Blog: Serverless | Create a Custom Basic API
Thanks for the answer
Yes, I have a handler.py file with that call in it.
And in my Dockerfile, I have a start command.
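Roughly this shape (a sketch of the typical setup; the actual file wasn't captured here, so the names are assumed):

```dockerfile
FROM python:3.11-slim

WORKDIR /app
RUN pip install --no-cache-dir runpod
COPY handler.py .

# -u runs Python unbuffered so logs show up immediately in the worker console
CMD ["python", "-u", "handler.py"]
```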
Everything normally works fine, but now every X hours the active worker reboots for no reason at all.
Active workers can shuffle; that's normal. There is no single worker dedicated to being the active worker: it's a last-man-standing algorithm, meant to optimize for cost.