Sagor Sarker
RunPod
Created by vlad000ss on 12/1/2024 in #⚡|serverless
Custom vLLM OpenAI compatible API
I don't know why the RunPod vLLM worker hasn't solved the tool-calling issue yet, even though it's an essential feature.
RunPod
Created by vlad000ss on 12/1/2024 in #⚡|serverless
Custom vLLM OpenAI compatible API
One of the main problems with the RunPod vLLM-based Docker image is that tool calling doesn't work. That is the reason I moved to a custom Docker build that serves the model with the vllm serve method. You are right: running "vllm serve ..." directly as the Docker entrypoint might not be compatible with RunPod. I will try to follow your suggestions. Thank you.
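For context, the usual workaround on RunPod serverless is to keep vLLM on localhost and put a RunPod handler in front of it, rather than making vllm serve the entrypoint. Below is a minimal sketch of that pattern, assuming port 8000 and the Llama 3.1 model from later in this thread; the tool-calling flags are vLLM's documented options for Llama 3.1 and may need adjusting (e.g. a custom chat template) for other models.

# handler.py - hedged sketch: run vLLM locally, proxy serverless jobs to it
import subprocess
import time

import requests
import runpod

VLLM_URL = "http://localhost:8000"  # assumed port; match your vllm serve flags

# Launch the OpenAI-compatible vLLM server in the background.
server = subprocess.Popen(
    ["vllm", "serve", "meta-llama/Llama-3.1-8B-Instruct",
     "--enable-auto-tool-choice", "--tool-call-parser", "llama3_json"]
)

# Block until the server reports healthy before accepting jobs.
while True:
    try:
        if requests.get(f"{VLLM_URL}/health", timeout=2).ok:
            break
    except requests.ConnectionError:
        time.sleep(2)

def handler(event):
    # Forward the job input as an OpenAI-style chat completion request.
    # Assumes callers send a chat-completions payload under "input".
    resp = requests.post(f"{VLLM_URL}/v1/chat/completions",
                         json=event["input"], timeout=300)
    return resp.json()

runpod.serverless.start({"handler": handler})

With a handler like this, jobs submitted through the RunPod queue actually reach the local server, instead of sitting in the queue with nothing to pick them up.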
RunPod
Created by vlad000ss on 12/1/2024 in #⚡|serverless
Custom vLLM OpenAI compatible API
Hi, there is no error showing. The request stays in the queue, and I can't see any process logs on the worker machine. I didn't specify the port; how can I do that? I tried the RunPod proxy method, and that worked for a single worker machine.
RunPod
Created by vlad000ss on 12/1/2024 in #⚡|serverless
Custom vLLM OpenAI compatible API
Hi @nerdylive, I have exposed port 8000 as a TCP port, since the server is running on that port. 1. I tried accessing it with the "Request" method that exists in the serverless console; it stays in the queue indefinitely. 2. I tried it programmatically like below:
curl --request POST \
  --url https://api.runpod.ai/v2/my_endpoint_id/runsync \
  --header "accept: application/json" \
  --header "authorization: my_runpod_api_key" \
  --header "content-type: application/json" \
  --data '
{
  "input": {
    "prompt": "What is the weather in Dhaka?"
  }
}
'
3. I tried the OpenAI-compatible route, since I served the model with the vllm serve command:
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/my_endpoint_id/openai/v1",
    api_key="my_runpod_api_key",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[
        # Bengali: "What's the weather like in Dhaka today?"
        {"role": "user", "content": "আজকে ঢাকার আবহাওয়া কেমন?"},
    ],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                },
                "required": ["location"],
            },
        },
    }],
    tool_choice="auto",
)

print(response)
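Since runsync just hangs, one way to see where a job gets stuck is to submit it asynchronously and poll the status endpoint: a status that never leaves IN_QUEUE usually means no handler is picking jobs up on the worker. A rough sketch, reusing the placeholder endpoint ID and API key from above:

import time

import requests

ENDPOINT = "https://api.runpod.ai/v2/my_endpoint_id"
HEADERS = {"authorization": "my_runpod_api_key", "content-type": "application/json"}

# Submit the job asynchronously instead of blocking on /runsync.
job = requests.post(f"{ENDPOINT}/run",
                    headers=HEADERS,
                    json={"input": {"prompt": "What is the weather in Dhaka?"}}).json()

# Poll /status until the job leaves IN_QUEUE / IN_PROGRESS.
while True:
    status = requests.get(f"{ENDPOINT}/status/{job['id']}", headers=HEADERS).json()
    print(status["status"])
    if status["status"] not in ("IN_QUEUE", "IN_PROGRESS"):
        break
    time.sleep(2)

print(status)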
RunPod
Created by vlad000ss on 12/1/2024 in #⚡|serverless
Custom vLLM OpenAI compatible API
I am having the same issue. I prepared a Docker container with a custom vLLM server inside and created a template from that image via Docker Hub. In serverless, the machine gets created and I can use the endpoint via localhost:port from inside the worker, but from outside I can't access the server; it just gets stuck. Maybe the OpenAI script above can't make a connection to it. Anyone have any clue?
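A quick way to confirm this split is a health check from inside the worker (e.g. via the web terminal). If the local call below succeeds while external requests still queue forever, the server itself is fine, and the missing piece is a handler bridging RunPod's job queue to the local port; the port here is an assumption carried over from earlier in the thread.

import requests

# Inside the worker this should return the served model list;
# from outside, traffic only reaches the container through the job queue.
print(requests.get("http://localhost:8000/v1/models", timeout=5).json())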