RunPod · 5mo ago
arken

RunPod Library + API

I am attempting to create an API that can either start/stop an existing pod, or create a pod and then start/stop it. I currently have something somewhat working:
import runpod
from flask import Flask, jsonify

app = Flask(__name__)
runpod.api_key = "YOUR_RUNPOD_API_KEY"
pod_id = "YOUR_POD_ID"

@app.route("/start_model", methods=["POST"])
def start_model():
    resume = runpod.resume_pod(pod_id=pod_id, gpu_count=1)
    return jsonify({"message": resume}), 200

@app.route("/stop_model", methods=["POST"])
def stop_model():
    stop = runpod.stop_pod(pod_id)
    return jsonify({"message": stop}), 200
However, this worked right up until availability for the required pod (an RTX 4090 that is also compatible with our network volume hosted in EU-RO) hit zero. The pod then disappeared from our listed pods and could no longer be stopped/resumed using the RunPod library.

My question is: is there a way around this that you know of? I know that I can also spin up a pod using this library, but it breaks once I begin specifying the particular network volume/Docker image (I assume because it doesn't know that only some GPUs in the Secure Cloud are compatible with the network volume?). If the RunPod Python library isn't officially supported, are there any other ways through the official RunPod API to achieve what I am trying to do?

As a side question, am I able to run a command, after the Docker image has finished building and the pod is ready, to spin up the Python API within the pod that I use to talk to my model for inferencing? That way I can remove the manual step of entering the container and starting the API myself after the container starts.

Any help is super appreciated! If you need any extra information, feel free to ask. (Just in case I don't get any notifications, pls @ me - I'd love to reply to you swiftly.)
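[Editor's note] One defensive pattern for the "pod disappeared" failure mode is to check whether the pod still exists before calling resume/stop. This is a sketch only: it assumes the `runpod` Python library's `get_pods()` returns a list of dicts with an "id" key, which you should verify against your installed SDK version.

```python
def find_pod(pods, pod_id):
    """Return the pod dict whose "id" matches pod_id, or None if missing.

    Intended to be called as find_pod(runpod.get_pods(), pod_id),
    assuming the SDK returns a list of dicts each carrying an "id" key.
    """
    return next((p for p in pods if p.get("id") == pod_id), None)

# Sketch of use inside the Flask route (runpod calls not executed here):
# pod = find_pod(runpod.get_pods(), pod_id)
# if pod is None:
#     return jsonify({"error": "pod no longer exists"}), 404
# resume = runpod.resume_pod(pod_id=pod_id, gpu_count=1)
```

Returning a 404 instead of crashing lets the caller decide whether to recreate the pod.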
1 Reply
ashleyk · 5mo ago
@arken Typically you can only terminate pods with a network volume attached, not stop them, so you are already working around that by using the API; there is no other way around it.

You CAN create a pod with a network volume attached by specifying the volume ID of your network volume as well as the ID of the data center your network volume resides in.

You can also add a public IP, use the API to get the public port for SSH, and then use the Python paramiko library to make SSH calls into your pod to configure things.
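[Editor's note] ashleyk's suggestion can be sketched as below. All values are hypothetical placeholders, and the parameter names (`network_volume_id`, `data_center_id`, `cloud_type`, etc.) are assumed from recent versions of the runpod-python SDK's `create_pod` — verify them against your installed version before relying on this.

```python
def build_pod_request(name, image, gpu_type, volume_id, dc_id):
    """Assemble keyword arguments for a hypothetical runpod.create_pod() call.

    Attaching an existing network volume requires both the volume ID and
    the matching data center ID (names assumed from the runpod-python SDK).
    """
    return {
        "name": name,
        "image_name": image,
        "gpu_type_id": gpu_type,
        "cloud_type": "SECURE",          # network volumes live in Secure Cloud
        "network_volume_id": volume_id,  # attach the existing network volume
        "data_center_id": dc_id,         # must match the volume's data center
        "support_public_ip": True,       # needed for the SSH step below
        "ports": "22/tcp",               # expose SSH
    }

# Sketch of use (not executed here):
# req = build_pod_request("inference", "myrepo/myimage:latest",
#                         "NVIDIA GeForce RTX 4090",
#                         "<network-volume-id>", "<data-center-id>")
# pod = runpod.create_pod(**req)
# Then fetch the pod's public IP/port for port 22 from the API and use
# paramiko.SSHClient().exec_command(...) to start your inference API,
# which also answers the side question about removing the manual step.
```

Pinning `data_center_id` to the volume's data center is what avoids the "some GPUs aren't compatible with the volume" breakage the question describes.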