Status endpoint only returns "COMPLETED" but no answer to the question
I'm currently using the v2/model_id/status/run_id endpoint, and the result I get is as follows:
{"delaytime": 26083, "executionTime":35737, "id": **, "status": "COMPLETED"}
My stream endpoint works fine, but for my purposes I'd rather wait longer and retrieve the entire result at once. How am I supposed to do that?
Thank you
Solution
Okay…
1) What is deployed to RunPod is:
https://github.com/hommayushi3/exllama-runpod-serverless/blob/master/handler.py
2) You need to change the line I specified at the bottom of the file. You should have a copy of this GitHub repo locally.
3) You have to rebuild the image and redeploy it to RunPod.
4) After that, future calls to the status endpoint will work.
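For anyone reading later: the exact line isn't quoted in this thread, so the following is only a guess at the kind of change meant in step 2, not the repo's actual code. It assumes the handler is a generator (which is why streaming works) and that the fix is the RunPod SDK's "return_aggregate_stream" option in the dict passed to runpod.serverless.start at the bottom of handler.py. The handler body, WORKER_CONFIG, and main are made-up names for illustration.

```python
# Sketch only: the thread does not quote the actual line to change.
# "handler", "WORKER_CONFIG" and "main" are hypothetical names.


def handler(job):
    """Hypothetical streaming handler: yields the answer piece by piece."""
    prompt = job["input"]["prompt"]
    # Stand-in for real model generation; each yield is one streamed chunk.
    for token in prompt.split():
        yield token


# Assumption: with "return_aggregate_stream" enabled, the RunPod SDK also
# collects the yielded chunks so the status endpoint can return them under
# "output" instead of only reporting "COMPLETED".
WORKER_CONFIG = {
    "handler": handler,
    "return_aggregate_stream": True,
}


def main():
    # Imported lazily so the sketch can be read and tested without the SDK.
    import runpod

    runpod.serverless.start(WORKER_CONFIG)
```

In a real handler.py you would call main() (or runpod.serverless.start directly) at the bottom of the file, then rebuild and redeploy as in steps 3 and 4.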

