Requests stuck in IN_QUEUE status
We deployed a LLaVA-v1.6-34B model on 2x A100 SXM infra as a serverless endpoint. When we send a request, we never get a response, and the request stays in the IN_QUEUE status indefinitely. Any suggestions for where we should start debugging?
We've previously deployed LLaVA-v1.5-13b successfully, but again, grateful for suggestions.
Click on the worker to view its logs; something is most likely erroring out.
Not seeing anything obviously wrong in the logs.
You're trying to use an image built for a pod in a serverless endpoint.
For serverless, use this instead: https://github.com/ashleykleynhans/runpod-worker-llava
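Once a worker image that actually implements the serverless handler is deployed, a submitted job should leave IN_QUEUE within seconds. A minimal client-side sketch for polling a job's status, assuming the standard RunPod serverless REST route (GET /status/{job_id} under api.runpod.ai/v2) and using placeholder environment variables for the endpoint ID, job ID, and API key:

```python
import json
import os
import urllib.request

API_BASE = "https://api.runpod.ai/v2"

def status_request(endpoint_id: str, job_id: str, api_key: str):
    """Build the URL and auth headers for polling a serverless job's status."""
    url = f"{API_BASE}/{endpoint_id}/status/{job_id}"
    headers = {"Authorization": f"Bearer {api_key}"}
    return url, headers

def poll_status(endpoint_id: str, job_id: str, api_key: str) -> str:
    """Fetch the job status string, e.g. IN_QUEUE, IN_PROGRESS, COMPLETED, FAILED."""
    url, headers = status_request(endpoint_id, job_id, api_key)
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["status"]

if __name__ == "__main__" and os.environ.get("RUNPOD_API_KEY"):
    # ENDPOINT_ID and JOB_ID are placeholders for your own values.
    print(poll_status(os.environ["ENDPOINT_ID"],
                      os.environ["JOB_ID"],
                      os.environ["RUNPOD_API_KEY"]))
```

If the status stays IN_QUEUE while the worker logs show nothing, the worker never registered a handler for the job, which is consistent with running a pod image on serverless.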
Thanks!