James
RunPod
Created by James on 1/21/2025 in #⚡|serverless
Cannot get a single endpoint to start
New to RunPod, but not new to LLMs or to running our own inference. So far, every vLLM template or vLLM worker I have set up has failed. I use only the most basic settings, and I have tried a wide range of GPU types with a variety of models (including the 'Quickstart' templates). Not a single worker has produced an endpoint that works or serves the OpenAI API. The status goes from 'Initializing' to 'Running', but then there is no response at all to any request. The logs don't seem to contain any information that helps me diagnose the issue. It might be that I am missing something silly, or that something is amiss; I'm just not sure. I could do with some assistance (and some better documentation) if someone from RunPod can help?
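For anyone hitting the same wall, it can help to rule out a malformed request before blaming the worker. Below is a minimal sketch of how a request to a RunPod serverless vLLM endpoint's OpenAI-compatible route can be constructed; the endpoint ID, API key, and model name are placeholders, and the exact route shape should be checked against RunPod's current docs:

```python
import json
import os
from urllib.request import Request

# Placeholders -- substitute your real endpoint ID and API key.
ENDPOINT_ID = os.environ.get("RUNPOD_ENDPOINT_ID", "my-endpoint-id")
API_KEY = os.environ.get("RUNPOD_API_KEY", "my-api-key")

# The vLLM worker exposes an OpenAI-compatible API under /openai/v1
# of the serverless endpoint (assumption based on RunPod's docs).
BASE_URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/openai/v1"


def build_chat_request(model: str, prompt: str) -> Request:
    """Build (but do not send) a chat-completions request."""
    body = json.dumps({
        "model": model,  # must match the model the worker was deployed with
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }).encode()
    return Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )


req = build_chat_request("mistralai/Mistral-7B-Instruct-v0.2", "Say hello")
print(req.full_url)
```

Sending this with `urlopen(req)` (or the equivalent `curl`) while watching the endpoint logs makes it easier to tell whether the request is reaching the worker at all, versus the worker accepting it and stalling.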
5 replies