RunPod•3mo ago
DEVIL_EGOX

vLLM + OpenWebUI

Hi guys, has anyone used vLLM as an endpoint in OpenWebUI? I have created a serverless pod, but it does not let me connect from OpenWebUI (running locally). Does anyone know if I have to configure the external port, and how?
27 Replies
nerdylive
nerdylive•3mo ago
Connect? How? It's best to use the OpenAI-compatible API.
DEVIL_EGOX
DEVIL_EGOXOP•3mo ago
It's because, for data confidentiality reasons, I want to use my own endpoint. I assumed that vLLM uses the same configuration as the OpenAI API, which is why I chose this option on RunPod.
nerdylive
nerdylive•3mo ago
Yes, use the OpenAI-compatible API on your endpoint. Check the RunPod docs for vLLM endpoints on how to use it.
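For reference, a minimal sketch of what calling a RunPod serverless vLLM endpoint through its OpenAI-compatible route can look like with the standard openai Python client. The endpoint ID, API key, and model name below are placeholders; check the RunPod vLLM docs for the exact URL format:
```python
# Minimal sketch: RunPod serverless vLLM endpoints expose an
# OpenAI-compatible route, so the standard openai client works.
# ENDPOINT_ID, the API key, and the model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1",
    api_key="YOUR_RUNPOD_API_KEY",
)

response = client.chat.completions.create(
    model="YOUR_MODEL_NAME",  # the model the endpoint serves
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```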
Ryan
Ryan•4w ago
@DEVIL_EGOX did you ever get this working?
DEVIL_EGOX
DEVIL_EGOXOP•4w ago
@Ryan Not yet🥲
Ryan
Ryan•4w ago
Dang, it's something I really want to be able to do too
nerdylive
nerdylive•4w ago
Use the OpenAI-compatible API... I've gotten it working before.
Ryan
Ryan•4w ago
You got it working or never?
nerdylive
nerdylive•4w ago
I wouldn't be giving advice like this if I hadn't tried it, hahah. I did get it working.
Ryan
Ryan•4w ago
Like this, yeah? I haven't been able to get it to connect.
(screenshot attached)
nerdylive
nerdylive•4w ago
Why? No... it's not like that; it should end with /v1, if I'm not wrong. Check the RunPod documentation on using vLLM, then find the URL format.
Ryan
Ryan•4w ago
right..... I guess I left out the last part: https://api.runpod.ai/v2/{RUNPOD_ENDPOINT_ID}/openai/v1. I got it working. The only problem is that every time I reload or change pages in my OpenWebUI site, it spins up a worker, because the endpoint gets triggered when OpenWebUI looks for available models.
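What Ryan describes matches OpenWebUI checking the connection's models route on page load; on a serverless endpoint that request alone can wake a worker. A sketch of the equivalent call with the openai client (same placeholder endpoint ID and key as above):
```python
# Sketch of the models-list request that fires when OpenWebUI
# refreshes its model list. On a serverless endpoint this request
# by itself can spin up a worker. Placeholders as before.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1",
    api_key="YOUR_RUNPOD_API_KEY",
)

for model in client.models.list():
    print(model.id)
```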
nerdylive
nerdylive•4w ago
Hmm, you can modify the Open WebUI code and rebuild it if you want. Nice.
Ryan
Ryan•4w ago
Actually, it seems like it's not a big issue; it's in the running status for only milliseconds. Actually, it may be an issue when the GPU I'm trying to use is unavailable... if OpenWebUI doesn't get a response, the site won't load for about a minute until the request times out.
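If the minute-long hang on an unavailable GPU is the concern, one workaround outside OpenWebUI is to probe the endpoint with a short client-side timeout before relying on it. A sketch, assuming the same placeholder endpoint and key:
```python
# Sketch: probe the endpoint with a short timeout so an unavailable
# GPU fails fast instead of hanging for about a minute.
# ENDPOINT_ID and the API key are placeholders.
from openai import OpenAI, APITimeoutError

client = OpenAI(
    base_url="https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1",
    api_key="YOUR_RUNPOD_API_KEY",
    timeout=10,  # seconds; the client default is much longer
)

try:
    client.models.list()
    print("endpoint reachable")
except APITimeoutError:
    print("endpoint did not respond within 10s")
```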
Aung Nanda Oo
Aung Nanda Oo•3w ago
Guys, I am facing an issue while using RunPod. Should the "vllm-" prefix be hardcoded? My endpoint ID does not have it anymore.
```python
from openai import OpenAI

RUNPOD_CHATBOT_URL = "https://api.runpod.ai/v2/vllm-runpod-endpoint-id/openai/v1"

# client pointed at the RunPod OpenAI-compatible route (API key is a placeholder)
client = OpenAI(base_url=RUNPOD_CHATBOT_URL, api_key="YOUR_RUNPOD_API_KEY")

model_name = "YOUR_MODEL_NAME"  # the model served by the endpoint

response = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": "What is the capital of Germany"}],
    temperature=0,
    top_p=0.8,
    max_tokens=2000,
)
```
This throws an error.
Ryan
Ryan•3w ago
@Aung Nanda Oo your connection URL in OpenWebUI should be set to this: https://api.runpod.ai/v2/YourServerlessEndpointIDhere/openai/v1
Aung Nanda Oo
Aung Nanda Oo•3w ago
Thanks I got it!
DEVIL_EGOX
DEVIL_EGOXOP•3w ago
Hi guys, it's me again. I have tried using the address as mentioned (https://api.runpod.ai/v2/a2auhmx8h7iu3x/openai/v1/), but I still can't connect. Help me, please 🥲 @nerdylive Any suggestions, please?
nerdylive
nerdylive•3w ago
What error? "Can't connect" is too ambiguous.
DEVIL_EGOX
DEVIL_EGOXOP•3w ago
(screenshots of the error attached)
DEVIL_EGOX
DEVIL_EGOXOP•3w ago
I use this configuration in the endpoint:
(screenshot of the endpoint configuration attached)
DEVIL_EGOX
DEVIL_EGOXOP•3w ago
@nerdylive Maybe I am misconfiguring the endpoint.
nerdylive
nerdylive•3w ago
No, it seems like it's correct if you use the template from RunPod, but... did you forget the API key? @DEVIL_EGOX btw, pinging me won't yield faster response times.
DEVIL_EGOX
DEVIL_EGOXOP•3w ago
If I don't put the API key there, should I declare it in the variables configuration? Would it be something like this: (API_KEY = XXXXXX)? 😬
nerdylive
nerdylive•3w ago
I'm not sure, I forget already. Can you check the OpenWebUI documentation to see if it can be configured that way? The API key should be your RunPod API key that has access to your endpoint; I'd suggest using one restricted to your endpoint, read-only.
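Before wiring the key into OpenWebUI, it may help to confirm it is accepted by the endpoint directly. A hedged check with the same client (a 401 here usually points to a missing or wrong key; ENDPOINT_ID and the key are placeholders):
```python
# Sketch: verify the RunPod API key can reach the endpoint before
# configuring it in OpenWebUI. Placeholders as before.
from openai import OpenAI, AuthenticationError

client = OpenAI(
    base_url="https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1",
    api_key="YOUR_RUNPOD_API_KEY",
)

try:
    client.models.list()
    print("key accepted")
except AuthenticationError:
    print("key rejected - check that it has access to this endpoint")
```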
DEVIL_EGOX
DEVIL_EGOXOP•3w ago
Thank you very much, I solved it; it was only the API key that was missing.
