Hi, is there currently an outage of the Serverless API?
The requests are stuck "IN_QUEUE" forever...
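For example, polling the status endpoint just keeps returning IN_QUEUE (a sketch; the IDs and key here are placeholders, not our real ones):
```python
import requests

# Placeholders - substitute your own endpoint ID, job ID, and API key
ENDPOINT_ID = "your-endpoint-id"
JOB_ID = "your-job-id"
API_KEY = "your-runpod-api-key"

r = requests.get(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/status/{JOB_ID}",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
print(r.json()["status"])  # prints "IN_QUEUE", indefinitely
```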
I've had similar issues when I provided an incorrect input body. Are you able to share the body you're using for your serverless endpoint?
As well as what template you're using and any other information you think could be useful.
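For reference, a request to a vLLM-style endpoint usually looks roughly like this (a sketch; the exact input fields depend on your worker's handler):
```python
import requests

# Placeholders - substitute your own endpoint ID and API key.
# The "input" fields below are typical for a vLLM-style worker,
# but your handler may expect different ones.
ENDPOINT_ID = "your-endpoint-id"
API_KEY = "your-runpod-api-key"

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/run",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"prompt": "Hello!", "sampling_params": {"max_tokens": 64}}},
)
print(resp.json())  # e.g. {"id": "...", "status": "IN_QUEUE"}
```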
This is my endpoint
1ifuoxegzxuhb4
We are using vLLM
I don't think the input body is wrong, though, because the same service has been running smoothly for 2-3 weeks already.
Things started to become unstable over the weekend, and today it's a full outage for us...
Got it. If you look at your serverless endpoint after you send a request, can you see whether it has any workers running? We might not have any availability on the GPUs you've chosen.
You can see all the requests are pending
Workers are running
Odd
And are booting correctly.
I'll bring this up internally as I'm not too sure what the issue could be; give me a moment.
Thanks!
What Docker image are you using? Our worker-vllm?
Ah, sorry, I was wrong. It's not vLLM; we use our own exllama image.
I think you need to debug your Docker image here.
It appears to be broken.
OK. Any logs on your end that you can share?
(To indicate that it's broken?)
Were you able to confirm exllama works on a GPU Pod?
Ah... it has been working for 2-3 weeks (we used it very actively).
Same image / model.
No new Docker builds?
Interesting.
Based on the logs, the requests are not getting to the handler.
This is our handler code:
It says "start generating", so I feel that it is reaching the handler.
Hmm, you are correct.
I guess two things here:
1) Maybe try creating a test endpoint with, say, 3 max workers and see if it works there. That way you'd at least isolate whether it's the original endpoint or your code (if both fail).
So it gets to the handler, but gets stuck 😆
2) I just tried with my LLM, which uses an async generator too, and it works fine:
https://github.com/justinwlin/Runpod-OpenLLM-Pod-and-Serverless/blob/main/handler.py
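The handler pattern there is roughly this (a minimal sketch, assuming the runpod Python SDK; the generator is a stand-in for the real model call):
```python
import asyncio
import runpod

async def generate_stream(prompt):
    # Stand-in for the real model's streaming generation loop
    for token in ["Hello", ",", " world"]:
        await asyncio.sleep(0)
        yield token

async def handler(job):
    prompt = job["input"]["prompt"]
    print("start generating")  # the log line confirming the job reached the handler
    async for chunk in generate_stream(prompt):
        yield chunk  # each yielded chunk is streamed back to the client

runpod.serverless.start({
    "handler": handler,
    # Aggregate the streamed chunks into the final result for /run requests
    "return_aggregate_stream": True,
})
```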
Got it.
So either you got a bad endpoint somehow, or something is off in your code or input.
Thanks. Let me look into that. It could be an issue with ExLlamaV2.