vllm +openwebui

DEVIL_EGOX•4mo ago

Hi guys, has anyone used vLLM as an endpoint in OpenWebUI? I have created a serverless endpoint, but I can't connect to it from OpenWebUI (running locally). Does anyone know if I have to configure an external port, and if so, how?
27 Replies
nerdylive•4mo ago
Connect how? It's best to use the OpenAI-compatible API.
DEVIL_EGOX (OP)•4mo ago
It's because, for data confidentiality reasons, I want to use my own endpoint. I assumed that vLLM uses the same configuration as the OpenAI API, which is why I chose this option on Runpod.
nerdylive•4mo ago
Yes, use the OpenAI API on your endpoint. Check the RunPod docs for vLLM endpoints on how to use RunPod's OpenAI-compatible API.
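Roughly this shape, for example (just a sketch; <ENDPOINT_ID>, <MODEL_NAME>, and <RUNPOD_API_KEY> are placeholders, and the model name has to match what you deployed on the endpoint):

```python
# Sketch: calling a RunPod serverless vLLM endpoint through its
# OpenAI-compatible route. <ENDPOINT_ID>, <MODEL_NAME>, and
# <RUNPOD_API_KEY> are placeholders.
import requests

resp = requests.post(
    "https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1/chat/completions",
    headers={"Authorization": "Bearer <RUNPOD_API_KEY>"},
    json={
        "model": "<MODEL_NAME>",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=120,
)
print(resp.json())
```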
Ryan•2mo ago
@DEVIL_EGOX Did you ever get this working?
DEVIL_EGOX (OP)•2mo ago
@Ryan Not yet 🥲
Ryan•2mo ago
Dang, it's something I really want to be able to do too
nerdylive•2mo ago
Use the OpenAI-compatible API... I've gotten it working before.
Ryan•2mo ago
You got it working, or never?
nerdylive•2mo ago
I wouldn't be giving advice like this if I hadn't tried it, haha. I did get it working.
Ryan•2mo ago
Like this, yeah? I haven't been able to get it to connect.
(screenshot attached)
nerdylive•2mo ago
Why? No... it's not like that; it should end with /v1 if I'm not wrong. Check the RunPod documentation on using vLLM and find the URL format.
Ryan•2mo ago
Right..... I guess I left out the last part: https://api.runpod.ai/v2/{RUNPOD_ENDPOINT_ID}/openai/v1
I got it working. The only problem is that every time I reload or change pages in my OpenWebUI site, it spins up a worker, because the endpoint gets triggered when OpenWebUI looks for available models.
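For reference, that model lookup is just a GET against the models route; something like this reproduces what OpenWebUI does on page load (a sketch; the endpoint ID and API key are placeholders):

```python
# Sketch: the models-list request OpenWebUI issues on page load,
# which is what wakes a serverless worker. <ENDPOINT_ID> and
# <RUNPOD_API_KEY> are placeholders.
import requests

resp = requests.get(
    "https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1/models",
    headers={"Authorization": "Bearer <RUNPOD_API_KEY>"},
    timeout=60,
)
print(resp.json())
```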
nerdylive•2mo ago
Hmm, you could modify the OpenWebUI code and rebuild it if you want. Nice!
Ryan•2mo ago
Actually, it seems like it's not a big issue; it's in the running status for milliseconds. Actually, it may be an issue when the GPU I'm trying to use is unavailable... if OpenWebUI doesn't get a response, the site won't load for about a minute until the request times out.
Aung Nanda Oo•2mo ago
Guys, I am facing an issue while using RunPod. Should the "vllm-" prefix be hardcoded in the URL, since the endpoint ID does not include it anymore?

```python
RUNPOD_CHATBOT_URL = "https://api.runpod.ai/v2/vllm-runpod-endpoint-id/openai/v1"

response = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": "What is the capital of Germany"}],
    temperature=0,
    top_p=0.8,
    max_tokens=2000,
)
```

This errors out.
Ryan•2mo ago
@Aung Nanda Oo Your connection URL in OpenWebUI should be set to this: https://api.runpod.ai/v2/YourServerlessEndpointIDhere/openai/v1
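If you're calling it from Python like in your snippet, the missing piece is just pointing the client at that same URL (a sketch, assuming the official openai package; the endpoint ID and key are placeholders):

```python
# Sketch: constructing the `client` used in the snippet above,
# pointed at the RunPod OpenAI-compatible route. The endpoint ID
# and API key are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/YourServerlessEndpointIDhere/openai/v1",
    api_key="<RUNPOD_API_KEY>",  # your RunPod API key
)
```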
Aung Nanda Oo•2mo ago
Thanks, I got it!
DEVIL_EGOX (OP)•2mo ago
Hi guys, again: I have tried using the address as mentioned (https://api.runpod.ai/v2/a2auhmx8h7iu3x/openai/v1/) but I still can't connect. Help me, please 🥲 @nerdylive Any suggestions, please?
nerdylive•2mo ago
What error? "Can't connect" is too ambiguous.
DEVIL_EGOX (OP)•2mo ago
(screenshot attached)
DEVIL_EGOX (OP)•2mo ago
(screenshot attached)
DEVIL_EGOX (OP)•2mo ago
I use this configuration in the endpoint:
(screenshot attached)
DEVIL_EGOX (OP)•2mo ago
@nerdylive Maybe I am misconfiguring the endpoint.
nerdylive•2mo ago
No, it seems correct if you use the template from RunPod, but... did you forget the API key? @DEVIL_EGOX By the way, pinging me won't yield faster response times.
DEVIL_EGOX (OP)•2mo ago
If I don't put the API key, should I declare it in the variables configuration? Would it be something like this: (API_KEY = XXXXXX)? 😬
nerdylive•2mo ago
I'm not sure, I forget already. Can you check the OpenWebUI documentation to see if it can be configured that way? The API key should be your RunPod API key that has access to your endpoint; I'd suggest using one restricted to your endpoint, read-only.
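If I remember right, OpenWebUI can also take these connection settings as environment variables (OPENAI_API_BASE_URL and OPENAI_API_KEY) at startup instead of through the UI; either way, the same values apply (a sketch, placeholders again):

```python
# Sketch: keeping the RunPod key out of code by reading it from the
# environment; the names mirror OpenWebUI's OPENAI_API_BASE_URL /
# OPENAI_API_KEY settings.
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ["OPENAI_API_BASE_URL"],  # https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1
    api_key=os.environ["OPENAI_API_KEY"],        # restricted, read-only RunPod key
)
```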
DEVIL_EGOX (OP)•2mo ago
Thank you very much, I solved it! It was only the API key that was missing.
