vllm +openwebui

DEVIL_EGOX•4mo ago

Hi guys, has anyone used vLLM as an endpoint in OpenWebUI? I have created a serverless endpoint, but I can't connect to it from OpenWebUI (running locally). Does anyone know if I have to configure an external port, and if so, how?
27 Replies
nerdylive•4mo ago
Connect how? It's best to use the OpenAI-compatible API.
DEVIL_EGOX (OP)•4mo ago
It's because, for data confidentiality reasons, I want to use my own endpoint. I assumed that vLLM uses the same configuration as the OpenAI API, which is why I chose this option on Runpod.
nerdylive•4mo ago
Yes, use the OpenAI API on your endpoint. Check the RunPod docs for vLLM endpoints on how to use RunPod's OpenAI-compatible API.
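Roughly this shape, for example (just a sketch; <ENDPOINT_ID>, <MODEL_NAME>, and <RUNPOD_API_KEY> are placeholders, and the model name has to match what you deployed on the endpoint):

```python
# Sketch: calling a RunPod serverless vLLM endpoint through its
# OpenAI-compatible route. <ENDPOINT_ID>, <MODEL_NAME>, and
# <RUNPOD_API_KEY> are placeholders.
import requests

resp = requests.post(
    "https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1/chat/completions",
    headers={"Authorization": "Bearer <RUNPOD_API_KEY>"},
    json={
        "model": "<MODEL_NAME>",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=120,
)
print(resp.json())
```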
Ryan•2mo ago
@DEVIL_EGOX Did you ever get this working?
DEVIL_EGOX (OP)•2mo ago
@Ryan Not yet 🥲
Ryan•2mo ago
Dang, it's something I really want to be able to do too
nerdylive•2mo ago
Use the OpenAI-compatible API... I've gotten it working before.
Ryan•2mo ago
You got it working, or never?
nerdylive•2mo ago
I wouldn't be giving advice like this if I hadn't tried it, haha. I did get it working.
Ryan•2mo ago
Like this, yeah? I haven't been able to get it to connect.
(screenshot attached)
nerdylive•2mo ago
Why? No... it's not like that; it should end with /v1 if I'm not wrong. Check the RunPod documentation on using vLLM and find the URL format.
Ryan•2mo ago
Right..... I guess I left out the last part: https://api.runpod.ai/v2/{RUNPOD_ENDPOINT_ID}/openai/v1
I got it working. The only problem is that every time I reload or change pages in my OpenWebUI site, it spins up a worker, because the endpoint gets triggered when OpenWebUI looks for available models.
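For reference, that model lookup is just a GET against the models route; something like this reproduces what OpenWebUI does on page load (a sketch; the endpoint ID and API key are placeholders):

```python
# Sketch: the models-list request OpenWebUI issues on page load,
# which is what wakes a serverless worker. <ENDPOINT_ID> and
# <RUNPOD_API_KEY> are placeholders.
import requests

resp = requests.get(
    "https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1/models",
    headers={"Authorization": "Bearer <RUNPOD_API_KEY>"},
    timeout=60,
)
print(resp.json())
```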
nerdylive•2mo ago
Hmm, you could modify the OpenWebUI code and rebuild it if you want. Nice!
Ryan•2mo ago
Actually, it seems like it's not a big issue; it's in the running status for milliseconds. Actually, it may be an issue when the GPU I'm trying to use is unavailable... if OpenWebUI doesn't get a response, the site won't load for about a minute until the request times out.
Aung Nanda Oo•2mo ago
Guys, I am facing an issue while using RunPod. Should the "vllm-" prefix be hardcoded in the URL, since the endpoint ID does not include it anymore?

```python
RUNPOD_CHATBOT_URL = "https://api.runpod.ai/v2/vllm-runpod-endpoint-id/openai/v1"

response = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": "What is the capital of Germany"}],
    temperature=0,
    top_p=0.8,
    max_tokens=2000,
)
```

This errors out.
Ryan•2mo ago
@Aung Nanda Oo Your connection URL in OpenWebUI should be set to this: https://api.runpod.ai/v2/YourServerlessEndpointIDhere/openai/v1
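If you're calling it from Python like in your snippet, the missing piece is just pointing the client at that same URL (a sketch, assuming the official openai package; the endpoint ID and key are placeholders):

```python
# Sketch: constructing the `client` used in the snippet above,
# pointed at the RunPod OpenAI-compatible route. The endpoint ID
# and API key are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/YourServerlessEndpointIDhere/openai/v1",
    api_key="<RUNPOD_API_KEY>",  # your RunPod API key
)
```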
Aung Nanda Oo•2mo ago
Thanks, I got it!
DEVIL_EGOX (OP)•2mo ago
Hi guys, again: I have tried using the address as mentioned (https://api.runpod.ai/v2/a2auhmx8h7iu3x/openai/v1/) but I still can't connect. Help me, please 🥲 @nerdylive Any suggestions, please?
nerdylive•2mo ago
What error? "Can't connect" is too ambiguous.
DEVIL_EGOX (OP)•2mo ago
(screenshot attached)
DEVIL_EGOX (OP)•2mo ago
(screenshot attached)
DEVIL_EGOX (OP)•2mo ago
I use this configuration in the endpoint:
(screenshot attached)
DEVIL_EGOX (OP)•2mo ago
@nerdylive Maybe I am misconfiguring the endpoint.
nerdylive•2mo ago
No, it seems correct if you use the template from RunPod, but... did you forget the API key? @DEVIL_EGOX By the way, pinging me won't yield faster response times.
DEVIL_EGOX (OP)•2mo ago
If I don't put the API key, should I declare it in the variables configuration? Would it be something like this: (API_KEY = XXXXXX)? 😬
nerdylive•2mo ago
I'm not sure, I forget already. Can you check the OpenWebUI documentation to see if it can be configured that way? The API key should be your RunPod API key that has access to your endpoint; I'd suggest using one restricted to your endpoint, read-only.
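If I remember right, OpenWebUI can also take these connection settings as environment variables (OPENAI_API_BASE_URL and OPENAI_API_KEY) at startup instead of through the UI; either way, the same values apply (a sketch, placeholders again):

```python
# Sketch: keeping the RunPod key out of code by reading it from the
# environment; the names mirror OpenWebUI's OPENAI_API_BASE_URL /
# OPENAI_API_KEY settings.
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ["OPENAI_API_BASE_URL"],  # https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1
    api_key=os.environ["OPENAI_API_KEY"],        # restricted, read-only RunPod key
)
```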
DEVIL_EGOX (OP)•2mo ago
Thank you very much, I solved it! It was only the API key that was missing.
