RunPod — 7d ago
James

Cannot get a single endpoint to start

New to RunPod, but not new to LLMs or to running our own inference. So far, every single vLLM template or vLLM worker that I have set up is failing. I use only the most basic settings, and have tried across a wide range of GPU types with a variety of models (including the 'Quickstart' templates). Not a single worker has created an endpoint that works or serves the OpenAI-compatible API. I see 'Initializing' and then 'Running', but then no response at all to any request. The logs don't seem to have any information that helps me diagnose the issue. It might well be that I am missing something silly, or that something is amiss; I'm just not sure. I could do with some assistance (and some better documentation) if there is someone from RunPod who can help?
3 Replies
nerdylive — 6d ago
no response? how do you send requests? i see nothing wrong in your description, might need to dig deeper
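As a starting point for the "how do you send requests?" question: a minimal sketch of building a chat-completions request against a RunPod serverless vLLM endpoint, assuming the standard vLLM worker template, which exposes an OpenAI-compatible API under the endpoint's `/openai/v1` path. The endpoint ID and model name here are placeholders; substitute your own from the RunPod console.

```python
import json

# Hypothetical endpoint ID -- replace with the one from your RunPod console.
ENDPOINT_ID = "abc123"


def build_chat_request(endpoint_id: str, model: str, prompt: str):
    """Build the URL and JSON payload for an OpenAI-style chat completion.

    Assumes the standard RunPod vLLM worker, which serves the
    OpenAI-compatible API at /openai/v1 of the serverless endpoint URL.
    """
    url = f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return url, payload


url, payload = build_chat_request(
    ENDPOINT_ID, "meta-llama/Llama-3.1-8B-Instruct", "Say hello"
)
print(url)
print(json.dumps(payload, indent=2))
# Send it with your RunPod API key in the Authorization header, e.g.:
#   requests.post(url, json=payload,
#                 headers={"Authorization": f"Bearer {RUNPOD_API_KEY}"})
```

If a request shaped like this hangs with no response, checking the endpoint's worker logs for the moment the request arrives (or doesn't) usually narrows down whether the problem is routing or the worker itself.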
Lattus — 6d ago
Lol, I'm having the same problem
nerdylive — 6d ago
maybe OOM?
