Worker is very frequently killed and replaced
I have an endpoint configured with 1 active worker and 2 max workers (24GB PRO).
The requests are being handled by an asynchronous handler.
For some unknown reason, the worker restarts every 30 minutes to 2 hours (sometimes less, sometimes more), and I can't see any errors or other failures in the logs. It's the same worker (according to the ID), but the container gets restarted.
What could be the reason for this?
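For reference, the handler is set up roughly like this (a minimal sketch, not my exact code; the handler body is just a placeholder):

```python
import runpod

async def handler(job):
    # job["input"] holds the payload submitted with the request
    payload = job["input"]
    # ... the actual async processing happens here (placeholder) ...
    return {"status": "ok", "echo": payload}

# Start the serverless worker with the async handler
runpod.serverless.start({"handler": handler})
```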
The system logs look like this:
2024-03-01T17:08:05Z start container
2024-03-01T17:18:07Z stop container
2024-03-01T17:18:08Z remove container
2024-03-01T17:18:08Z remove network
2024-03-01T18:20:22Z create pod network
2024-03-01T18:20:22Z create container XXXXX
2024-03-01T18:20:22Z start container
2024-03-01T18:30:17Z stop container
2024-03-01T18:30:17Z remove container
2024-03-01T18:30:17Z remove network
2024-03-01T18:38:56Z create pod network
2024-03-01T18:38:56Z create container XXXXX
2024-03-01T18:38:56Z start container
2024-03-01T18:57:44Z stop container
2024-03-01T18:57:45Z remove container
2024-03-01T18:57:45Z remove network
2024-03-01T19:04:17Z create pod network
2024-03-01T19:04:17Z create container XXXXXX
2024-03-01T19:04:17Z start container
2024-03-01T19:19:58Z stop container
2024-03-01T19:20:00Z remove container
2024-03-01T19:20:00Z remove network
2024-03-01T19:20:24Z create pod network
2024-03-01T19:20:24Z create container XXXXXXXX
2024-03-01T19:20:26Z start container
2024-03-01T19:21:05Z stop container
2024-03-01T19:21:07Z remove container
2024-03-01T19:21:07Z remove network
2024-03-01T19:21:34Z create pod network
2024-03-01T19:21:34Z create container XXXXXXXX
2024-03-01T19:21:35Z start container
What's the endpoint ID?
1hdfqkkbw41swp
Thanks for looking into it
Those logs are normal; that's just what it looks like when your workers scale up and down.