New vLLM Serverless interface issue
Hi guys, I logged in early to run my vLLM worker, which had been working perfectly before, but I noticed that the Serverless interface has changed: there's no OpenAI-compatible URL anymore. My code was also hitting internal server errors. I'd appreciate it if you could share fixes for this issue. I'm not sure whether this page has been updated for the new interface: https://docs.runpod.io/serverless/workers/vllm/openai-compatibility
The same URL?
base_url=f"https://api.runpod.ai/v2/{RUNPOD_ENDPOINT_ID}/openai/v1",
It should look like that, right?
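For reference, a minimal sketch of how that base_url is usually wired into the OpenAI Python client; the environment variable names and the model name here are placeholders, not anything from this thread:

# Point the OpenAI client at a RunPod Serverless vLLM endpoint.
# RUNPOD_ENDPOINT_ID and RUNPOD_API_KEY are assumed environment variables.
import os
from openai import OpenAI

RUNPOD_ENDPOINT_ID = os.environ["RUNPOD_ENDPOINT_ID"]

client = OpenAI(
    api_key=os.environ["RUNPOD_API_KEY"],
    base_url=f"https://api.runpod.ai/v2/{RUNPOD_ENDPOINT_ID}/openai/v1",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)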
How is your endpoint? Any workers up? Any logs?
Yeah, everything's the same. I tried to run it several times and was able to get it working again. I guess I missed rerunning a code chunk.
I've just tried it; it works if the vLLM side is right and the client is right.
nicee
I've just tried a new model and got the same error as yours, then figured out from the logs that some models don't support "system" messages; it has to be user and assistant only.
Yeah, this prompt format only works for Llama models, which is what I'm currently using.
Haven't tried other models with a different prompt format yet, like Mistral.
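If anyone else hits the system-role error, here is a rough workaround sketch, assuming the fix is simply to merge the system prompt into the first user message before sending (the helper name is made up):

def fold_system_prompt(messages):
    """Merge any 'system' messages into the first 'user' message,
    for models that only accept 'user' and 'assistant' roles."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if system_parts and rest and rest[0]["role"] == "user":
        rest[0] = {
            "role": "user",
            "content": "\n\n".join(system_parts) + "\n\n" + rest[0]["content"],
        }
    return rest

# Usage: messages = fold_system_prompt(messages) before calling
# client.chat.completions.create(...)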
Just an FYI in case others are experiencing this issue: the previous endpoint ID format was vllm-[pod id]; the current version doesn't have the "vllm-" prefix anymore.
Ahh, you hardcoded the "vllm-" prefix?
It's like this in my script: Runpod_endpoint = "vllm-xxx". Previously I only replaced the xxx part.
Ohh, I see.
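For anyone updating an older script, a small sketch of the change (the environment variable name is illustrative; Runpod_endpoint matches the snippet above):

import os

# Old format: the endpoint ID carried a "vllm-" prefix.
# Runpod_endpoint = "vllm-xxx"

# New format: just the endpoint ID, no "vllm-" prefix.
Runpod_endpoint = os.environ["RUNPOD_ENDPOINT_ID"]
base_url = f"https://api.runpod.ai/v2/{Runpod_endpoint}/openai/v1"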