New vLLM Serverless interface issue
Hi guys, I logged in early to run my vLLM worker, which had been working perfectly before, but I noticed that the Serverless interface has changed: there's no OpenAI-compatible URL anymore. My code was also hitting internal server errors. I'd appreciate it if you could share fixes for this issue. I'm not sure whether this page has been updated for the new interface: https://docs.runpod.io/serverless/workers/vllm/openai-compatibility
The same URL?
base_url=f"https://api.runpod.ai/v2/{RUNPOD_ENDPOINT_ID}/openai/v1",
It should look like that, right?
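For reference, a minimal sketch of how that base_url is usually wired into the OpenAI Python client; the environment variable names and the model name here are placeholders, not anything from this thread:

# Point the OpenAI client at a RunPod Serverless vLLM endpoint.
# RUNPOD_ENDPOINT_ID and RUNPOD_API_KEY are assumed environment variables.
import os
from openai import OpenAI

RUNPOD_ENDPOINT_ID = os.environ["RUNPOD_ENDPOINT_ID"]

client = OpenAI(
    api_key=os.environ["RUNPOD_API_KEY"],
    base_url=f"https://api.runpod.ai/v2/{RUNPOD_ENDPOINT_ID}/openai/v1",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)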
How is your endpoint? Any workers up? Any logs?
Yeah, everything's the same. I tried to run it several times and was able to get it working again. I guess I missed rerunning a code chunk.
I've just tried it; it works if the vLLM side is right and the client is right.
nicee
I've just tried a new model and got the same error as yours, then figured out from the logs that some models don't support "system" messages; it has to be user and assistant only.
Yeah, this prompt format only works for Llama models, which is what I'm currently using.
Haven't tried other models with a different prompt format yet, like Mistral.
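If anyone else hits the system-role error, here is a rough workaround sketch, assuming the fix is simply to merge the system prompt into the first user message before sending (the helper name is made up):

def fold_system_prompt(messages):
    """Merge any 'system' messages into the first 'user' message,
    for models that only accept 'user' and 'assistant' roles."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if system_parts and rest and rest[0]["role"] == "user":
        rest[0] = {
            "role": "user",
            "content": "\n\n".join(system_parts) + "\n\n" + rest[0]["content"],
        }
    return rest

# Usage: messages = fold_system_prompt(messages) before calling
# client.chat.completions.create(...)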
Just an FYI in case others are experiencing this issue: the previous endpoint ID format was vllm-[pod id]; the current version doesn't have the "vllm-" prefix anymore.
Ahh, you hardcoded the "vllm-" prefix?
It's like this in my script: Runpod_endpoint = "vllm-xxx". Previously I only replaced the xxx part.
Ohh, I see.
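For anyone updating an older script, a small sketch of the change (the environment variable name is illustrative; Runpod_endpoint matches the snippet above):

import os

# Old format: the endpoint ID carried a "vllm-" prefix.
# Runpod_endpoint = "vllm-xxx"

# New format: just the endpoint ID, no "vllm-" prefix.
Runpod_endpoint = os.environ["RUNPOD_ENDPOINT_ID"]
base_url = f"https://api.runpod.ai/v2/{Runpod_endpoint}/openai/v1"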