best practice to terminate pods on job completion
I have a one-time job I want to run as a GPU pod. Currently the container gets restarted as soon as the job finishes. What's the best way to terminate the pod after completion?
GitHub - runpod/runpodctl: 🧰 | RunPod CLI for pod management
Thanks for the pointer. You mean I should install it in the container and add my credentials to it, and then terminate as soon as the job is done?
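Roughly, yes. Here is a minimal sketch of that approach in Python, assuming runpodctl is installed and authenticated with an API key inside the container, that `runpodctl remove pod <id>` is the terminate command, and that RunPod exposes the pod's ID as the `RUNPOD_POD_ID` environment variable (verify both against the docs); `train.py` is just a placeholder for the actual job:

```python
# entrypoint.py - run the one-time job, then terminate this pod.
# Assumes: runpodctl is installed and configured with an API key in the
# container, and the pod's ID is available as the RUNPOD_POD_ID env var.
import os
import subprocess

def run_job() -> None:
    # Placeholder for the actual one-time GPU workload.
    subprocess.run(["python", "train.py"], check=True)

def terminate_self() -> None:
    pod_id = os.environ["RUNPOD_POD_ID"]
    # "remove pod" deletes the pod so it is not restarted or billed further.
    subprocess.run(["runpodctl", "remove", "pod", pod_id], check=True)

if __name__ == "__main__":
    try:
        run_job()
    finally:
        # Terminate even if the job raised, so the pod never idles.
        terminate_self()
```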
You may also want to consider using serverless instead of GPU cloud.
I don't have a large number of jobs (~20) each running for ~1h. Does that still make sense?
Yeah, I think this is fine.
Whatever you have running on serverless is essentially what could be running on GPU Cloud, except that you call a handler.py that invokes the RunPod serverless start method instead. You can check their documentation.
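For example, a minimal handler sketch, assuming the RunPod Python SDK's `runpod.serverless.start` entry point; `transcribe` and the `audio_url` input field are illustrative names, not part of the SDK:

```python
# handler.py - minimal RunPod serverless worker.
# Assumes the RunPod Python SDK (pip install runpod) and its
# runpod.serverless.start entry point; transcribe() is a stand-in
# for whatever the job actually does.
import runpod

def transcribe(audio_url: str) -> str:
    # Stand-in for the real GPU workload (e.g. load a model and run inference).
    return f"transcript for {audio_url}"

def handler(job):
    # "input" holds whatever JSON the caller sent when queueing the job.
    job_input = job["input"]
    result = transcribe(job_input["audio_url"])
    # The returned value becomes the job's output; the worker scales down
    # on its own when the queue is empty, so nothing has to be terminated.
    return {"transcript": result}

# Register the handler and start polling the endpoint's job queue.
runpod.serverless.start({"handler": handler})
```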
I should really improve this lol, but it should give you a decent pseudo example:
https://discord.com/channels/912829806415085598/1194695853026328626
I also run jobs every so often, like transcriptions, on RunPod, but I don't want to be spinning up a pod every time.
So I just send them to a serverless function to queue up
and wait for the responses to finish.
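A rough client-side sketch of that queue-and-wait pattern, assuming the public serverless REST API shape (`POST .../run` to queue, `GET .../status/<id>` to poll) and placeholder endpoint ID, API key, input fields, and status names:

```python
# submit_jobs.py - queue a handful of jobs on a serverless endpoint and
# wait for them to finish. Endpoint ID, API key, input fields, and the
# status strings below are assumptions to check against the RunPod docs.
import time
import requests

API_KEY = "YOUR_RUNPOD_API_KEY"    # placeholder
ENDPOINT_ID = "YOUR_ENDPOINT_ID"   # placeholder
BASE = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

def submit(audio_url: str) -> str:
    # Queue one job; the response contains the job id to poll later.
    resp = requests.post(f"{BASE}/run",
                         json={"input": {"audio_url": audio_url}},
                         headers=HEADERS)
    resp.raise_for_status()
    return resp.json()["id"]

def wait(job_id: str, poll_seconds: int = 10) -> dict:
    # Poll until the job reaches a terminal state.
    while True:
        status = requests.get(f"{BASE}/status/{job_id}", headers=HEADERS).json()
        if status.get("status") in ("COMPLETED", "FAILED", "CANCELLED"):
            return status
        time.sleep(poll_seconds)

if __name__ == "__main__":
    job_ids = [submit(url) for url in ["https://example.com/a.wav"]]  # illustrative inputs
    for job_id in job_ids:
        print(wait(job_id))
```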
On GPU Cloud the difference is usually that you start an OpenSSH server and JupyterLab, but the environment is the same setup; if it works on GPU Cloud, it will work in serverless (most of the time, with some exceptions).
Okay awesome, thank you for the link to the how-to thread. I will give it a look.