How does RunPod handle pod termination?
It is very likely that RunPod simply sends a SIGKILL to the main container process. This is really annoying when you are trying to handle termination. Could you please provide information on how your orchestration system handles pod termination and how I can receive the OS signal?
Just to mention: Kubernetes sends SIGTERM first
It is crucial for us because of LLM streaming. We want to use some big GPUs, but it is not really possible, because any in-progress stream will be terminated no matter what 😦
On Spots?
What do you mean it will be terminated no matter what? When you rent an on-demand pod, it won't be terminated until it's stopped, I believe.
Yes, it is, but while a stream is in progress it would be dropped without any grace period. My point is that the pod should get a grace period instead of an instant SIGKILL.
in spot or on demand?
on demand
The pod dies -> stream dies
Is that what you mean? I still don't get it.
what were you hoping for then?
yeah you right
alright what are you hoping it to be
for a graceful minute
pod dies -> pod still alive for a minute -> pod dies?
pod receives termination request -> SIGTERM -> 1 min alive (grace period) -> platform sends SIGKILL and the pod dies
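The sequence described above can be sketched from the supervisor's side. This is only an illustration of the SIGTERM -> grace period -> SIGKILL pattern using Python's standard library on a POSIX system; the function name `stop_gracefully` and the return strings are made up for this example, and nothing here reflects how RunPod is actually implemented.

```python
import subprocess


def stop_gracefully(proc: subprocess.Popen, grace_seconds: float) -> str:
    """Ask a child process to stop, then force it after a grace period.

    Returns a label describing which path was taken (labels are
    illustrative, not part of any real orchestrator API).
    """
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        # Give the process time to finish in-flight work and exit.
        proc.wait(timeout=grace_seconds)
        return "exited-after-sigterm"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX; cannot be caught
        proc.wait()  # reap the process
        return "killed-after-grace-period"
```

A process that exits on SIGTERM takes the first path; one that ignores SIGTERM is killed once `grace_seconds` has elapsed.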
oh why should it be like that, just shut down your program manually and terminate the pod right?
or terminate after your streams is finished
Because Kubernetes does it that way, and it lets the pod handle graceful termination instead of instant annihilation (which is what RunPod does).
Yes, because after the pod stops, it stops charging you. Isn't that how other clouds are supposed to work?
sorry im still trying to understand what your use case is
but if you want you can directly write a #🧐|feedback
I appreciate your advice, I'll send it as feedback tho
yup
From what I'm getting, the pod should stop after the program stops. I think you can do that by using RunPod's API to stop or terminate the pod, or the runpodctl pod command, after your program stops. So you can run your program via a sh script where the next line is that command, or send a request to the RunPod API to stop the pod.
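The suggestion above (run the program, then stop the pod) could be sketched as a small wrapper. Several things here are assumptions rather than facts from this thread: that the CLI accepts the form `runpodctl stop pod <id>` (check `runpodctl --help` for your version), that the pod ID is available to the script, and the function names themselves.

```python
import subprocess


def build_stop_command(pod_id: str) -> list:
    # Assumption: this subcommand form exists in your runpodctl version.
    return ["runpodctl", "stop", "pod", pod_id]


def run_then_stop(program: list, pod_id: str, run=subprocess.run) -> int:
    """Run `program` to completion, then request that the pod be stopped.

    `run` is injectable so the wrapper can be exercised without a
    real pod or CLI installed. Returns the program's exit code.
    """
    result = run(program)  # blocks until the program exits
    run(build_stop_command(pod_id))  # only reached after the program is done
    return result.returncode
```

Inside a pod this might be called as `run_then_stop(["python", "serve.py"], os.environ["RUNPOD_POD_ID"])`, where `serve.py` is a placeholder and `RUNPOD_POD_ID` is assumed to be set in the pod's environment. This stops the pod only after the program exits on its own; it does not add a grace period to a stop issued from outside.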