RunPod — 7d ago
James

Cannot get a single endpoint to start

New to RunPod, but not new to LLMs or to running our own inference. So far, every single vLLM template or vLLM worker that I have set up is failing. I use only the most basic settings, and have tried across a wide range of GPU types with a variety of models (including the 'Quickstart' templates). Not a single worker has created an endpoint that works or serves the OpenAI-compatible API. I see 'Initializing' and then 'Running', but then no response at all to any request. The logs don't seem to have any information that helps me diagnose the issue. It might well be that I am missing something silly, or that something is amiss; I'm just not sure. I could do with some assistance (and some better documentation) if there is someone from RunPod who can help?
3 Replies
nerdylive — 6d ago
no response? how do you send requests? i see nothing wrong in your description, might need to dig deeper
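As a starting point for the "how do you send requests?" question: a minimal sketch of building a chat-completions request against a RunPod serverless vLLM endpoint, assuming the standard vLLM worker template, which exposes an OpenAI-compatible API under the endpoint's `/openai/v1` path. The endpoint ID and model name here are placeholders; substitute your own from the RunPod console.

```python
import json

# Hypothetical endpoint ID -- replace with the one from your RunPod console.
ENDPOINT_ID = "abc123"


def build_chat_request(endpoint_id: str, model: str, prompt: str):
    """Build the URL and JSON payload for an OpenAI-style chat completion.

    Assumes the standard RunPod vLLM worker, which serves the
    OpenAI-compatible API at /openai/v1 of the serverless endpoint URL.
    """
    url = f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return url, payload


url, payload = build_chat_request(
    ENDPOINT_ID, "meta-llama/Llama-3.1-8B-Instruct", "Say hello"
)
print(url)
print(json.dumps(payload, indent=2))
# Send it with your RunPod API key in the Authorization header, e.g.:
#   requests.post(url, json=payload,
#                 headers={"Authorization": f"Bearer {RUNPOD_API_KEY}"})
```

If a request shaped like this hangs with no response, checking the endpoint's worker logs for the moment the request arrives (or doesn't) usually narrows down whether the problem is routing or the worker itself.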
Lattus — 6d ago
Lol, I'm having the same problem
nerdylive — 6d ago
maybe OOM?
