RunPod
•Created by Ben on 3/10/2024 in #⚡|serverless
Serverless API Question
Hi, I am currently following this guide, https://doc.runpod.io/reference/runpod-apis, and attempting to retrieve my results with the status request. However, the response I get looks like this:
{
"delayTime": 66679,
"executionTime": 41266,
"id": "5dbd4fb0-b6f9-44d9-a242-820d9ddbc929-u1",
"status": "COMPLETED"
}
I am wondering where the input and output shown in the documentation are. In my inference function, I'm simply returning the result as a JSON object. Is this an issue with my function?
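For reference, my handler is structured roughly like this (simplified; run_inference and the "prompt" key are stand-ins for my actual model call):

import runpod

def run_inference(text):
    # stand-in for the real model call
    return {"echo": text}

def handler(job):
    # job["input"] is whatever was posted in the body of the /run request
    prompt = job["input"].get("prompt", "")
    # my understanding is that whatever this returns should come back
    # as the "output" field of the /status response
    return run_inference(prompt)

runpod.serverless.start({"handler": handler})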
8 replies
RunPod
•Created by Ben on 3/9/2024 in #⚡|serverless
Serverless Inference
Hi, I have been using RunPod to train my model, and I am very interested in using serverless computing to deploy it. I have successfully created a Docker image that loads the model and contains an inference endpoint function. However, the model is rather large, and I am curious whether there is a way to hold the model in RAM to avoid loading it every time the container is stopped and restarted. If not, could anyone recommend another resource for model deployment? Would a traditional server be a better option here?
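For reference, here's roughly how my worker is set up; the model is loaded once at import time, but that in-memory copy is lost whenever the container stops (torch and the weights path are placeholders for my actual setup):

import runpod
import torch  # placeholder; substitute whatever framework the model uses

# Loaded once at import, outside the handler. A warm worker keeps this
# process (and the model) in memory between requests; a cold start pays
# the full load cost again.
MODEL = torch.jit.load("/model/weights.pt")  # placeholder path
MODEL.eval()

def handler(job):
    with torch.no_grad():
        x = torch.tensor(job["input"]["data"])
        return {"prediction": MODEL(x).tolist()}

runpod.serverless.start({"handler": handler})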
4 replies