Status endpoint only returns "COMPLETED" but no answer to the question

I'm currently using the v2/model_id/status/run_id endpoint, and the result I get is as follows:

{"delaytime": 26083, "executionTime":35737, "id": **, "status": "COMPLETED"}

My stream endpoint works fine, but for my purposes I'd rather wait longer and retrieve the entire result at once. How am I supposed to do that?

Thank you
Solution
Okay
1) What is deployed to RunPod is this handler:

https://github.com/hommayushi3/exllama-runpod-serverless/blob/master/handler.py

2) You need to change the line I specified at the bottom of the file. You should have a copy of this GitHub repo locally.

3) You have to rebuild the image and redeploy it to RunPod.

4) When you call it in the future, it will work.
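
To illustrate the kind of change step 2 is about, here is a rough sketch, not the actual code from the repo. It assumes the deployed handler is a generator that yields text chunks, which would explain why the stream endpoint has data while the status endpoint only reports "COMPLETED". The names stream_handler and full_result_handler and the "prompt" input key are made up for the example.

```python
from typing import Any, Dict, Iterator


def stream_handler(job: Dict[str, Any]) -> Iterator[str]:
    """Hypothetical stand-in for a streaming handler: yields the reply in chunks."""
    prompt = job["input"]["prompt"]  # "prompt" is an assumed input key
    for word in f"echo: {prompt}".split(" "):
        yield word + " "


def full_result_handler(job: Dict[str, Any]) -> str:
    """Collect every streamed chunk and return the whole text in one piece."""
    return "".join(stream_handler(job)).rstrip()


# In a real worker, the last line of handler.py registers the handler with the
# RunPod Python SDK, roughly like this (left as a comment so the sketch runs
# without the SDK installed):
#
#   runpod.serverless.start({"handler": full_result_handler})
#
# As far as I know the SDK also accepts "return_aggregate_stream": True next
# to a generator handler, which is another way to get the collected output.

if __name__ == "__main__":
    print(full_result_handler({"input": {"prompt": "hello world"}}))
```

With a handler that returns a value instead of only yielding, the status response for a finished job should carry the result alongside "status": "COMPLETED", but check this against the SDK version you deploy.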