Sagor Sarker
RunPod
Created by vlad000ss on 12/1/2024 in #⚡|serverless
Custom vLLM OpenAI compatible API
I don't know why the RunPod vLLM worker hasn't solved the tool-calling issue yet, even though it's an essential feature.
RunPod
Created by vlad000ss on 12/1/2024 in #⚡|serverless
Custom vLLM OpenAI compatible API
One of the main problems with the RunPod vLLM-based Docker image is that tool calling doesn't work. That is the reason I moved to a custom Docker build that serves the model with the vllm serve method. You are right: running "vllm serve ..." directly as the Docker entrypoint might not be compatible with RunPod. I will try to follow your suggestions. Thank you.
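For context, the usual workaround on RunPod serverless is to keep vLLM on localhost and put a RunPod handler in front of it, rather than making vllm serve the entrypoint. Below is a minimal sketch of that pattern, assuming port 8000 and the Llama 3.1 model from later in this thread; the tool-calling flags are vLLM's documented options for Llama 3.1 and may need adjusting (e.g. a custom chat template) for other models.

# handler.py - hedged sketch: run vLLM locally, proxy serverless jobs to it
import subprocess
import time

import requests
import runpod

VLLM_URL = "http://localhost:8000"  # assumed port; match your vllm serve flags

# Launch the OpenAI-compatible vLLM server in the background.
server = subprocess.Popen(
    ["vllm", "serve", "meta-llama/Llama-3.1-8B-Instruct",
     "--enable-auto-tool-choice", "--tool-call-parser", "llama3_json"]
)

# Block until the server reports healthy before accepting jobs.
while True:
    try:
        if requests.get(f"{VLLM_URL}/health", timeout=2).ok:
            break
    except requests.ConnectionError:
        time.sleep(2)

def handler(event):
    # Forward the job input as an OpenAI-style chat completion request.
    # Assumes callers send a chat-completions payload under "input".
    resp = requests.post(f"{VLLM_URL}/v1/chat/completions",
                         json=event["input"], timeout=300)
    return resp.json()

runpod.serverless.start({"handler": handler})

With a handler like this, jobs submitted through the RunPod queue actually reach the local server, instead of sitting in the queue with nothing to pick them up.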
RunPod
Created by vlad000ss on 12/1/2024 in #⚡|serverless
Custom vLLM OpenAI compatible API
Hi, there is no error showing. The request stays in the queue, and I can't see any process logs on the worker machine. I didn't specify the port; how can I do that? I tried the RunPod proxy method, and that worked for a single worker machine.
RunPod
Created by vlad000ss on 12/1/2024 in #⚡|serverless
Custom vLLM OpenAI compatible API
Hi @nerdylive, I have exposed port 8000 as a TCP port, since the server is running on that port. 1. I tried accessing it with the "Request" method that exists in the serverless console; it stays in the queue indefinitely. 2. I tried it programmatically like below:
curl --request POST \
  --url https://api.runpod.ai/v2/my_endpoint_id/runsync \
  --header "accept: application/json" \
  --header "authorization: my_runpod_api_key" \
  --header "content-type: application/json" \
  --data '
{
  "input": {
    "prompt": "What is the weather in Dhaka?"
  }
}
'
3. I tried the OpenAI-compatible route, since I served the model with the vllm serve command:
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/my_endpoint_id/openai/v1",
    api_key="my_runpod_api_key",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[
        # Bengali: "What's the weather like in Dhaka today?"
        {"role": "user", "content": "আজকে ঢাকার আবহাওয়া কেমন?"},
    ],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                },
                "required": ["location"],
            },
        },
    }],
    tool_choice="auto",
)

print(response)
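Since runsync just hangs, one way to see where a job gets stuck is to submit it asynchronously and poll the status endpoint: a status that never leaves IN_QUEUE usually means no handler is picking jobs up on the worker. A rough sketch, reusing the placeholder endpoint ID and API key from above:

import time

import requests

ENDPOINT = "https://api.runpod.ai/v2/my_endpoint_id"
HEADERS = {"authorization": "my_runpod_api_key", "content-type": "application/json"}

# Submit the job asynchronously instead of blocking on /runsync.
job = requests.post(f"{ENDPOINT}/run",
                    headers=HEADERS,
                    json={"input": {"prompt": "What is the weather in Dhaka?"}}).json()

# Poll /status until the job leaves IN_QUEUE / IN_PROGRESS.
while True:
    status = requests.get(f"{ENDPOINT}/status/{job['id']}", headers=HEADERS).json()
    print(status["status"])
    if status["status"] not in ("IN_QUEUE", "IN_PROGRESS"):
        break
    time.sleep(2)

print(status)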
RunPod
Created by vlad000ss on 12/1/2024 in #⚡|serverless
Custom vLLM OpenAI compatible API
I am having the same issue. I prepared a Docker container with a custom vLLM server inside and created a template from that image via Docker Hub. In serverless, the machine gets created and I can use the endpoint via localhost:port from inside the worker, but from outside I can't access the server; it just gets stuck. Maybe the OpenAI script above can't make a connection to it. Anyone have any clue?
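A quick way to confirm this split is a health check from inside the worker (e.g. via the web terminal). If the local call below succeeds while external requests still queue forever, the server itself is fine, and the missing piece is a handler bridging RunPod's job queue to the local port; the port here is an assumption carried over from earlier in the thread.

import requests

# Inside the worker this should return the served model list;
# from outside, traffic only reaches the container through the job queue.
print(requests.get("http://localhost:8000/v1/models", timeout=5).json())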