Runpod • 16mo ago
naaviii

Urgent: Issue with Runpod vllm Serverless Endpoint

We are encountering a critical issue with the Runpod vLLM serverless endpoint. Specifically, when a network volume is attached, the following code fails:

response = client.completions.create(
    model="llama3-dumm/llm",
    prompt=["hello? How are you "],
    temperature=0.8,
    max_tokens=600,
)

But the following call works:
response = client.chat.completions.create(
    model="llama3-dumm/llm",
    messages=[{'role': 'user', 'content': "hell0"}],
    max_tokens=100,
    temperature=0.9,
)


And this is the client object:
from openai import OpenAI

client = OpenAI(
    api_key=api_key,
    base_url=f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1",
)
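For context, both calls go through the same base URL; the OpenAI client just appends different routes to it. A minimal sketch of the resulting request paths (endpoint_id is a placeholder, not a real ID):

```python
# Sketch of how the OpenAI client routes the two calls under Runpod's
# OpenAI-compatible base URL. endpoint_id below is a placeholder.
endpoint_id = "endpoint_id"
base_url = f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1"

# client.completions.create(...) posts to .../v1/completions (the failing call)
completions_url = f"{base_url}/completions"
# client.chat.completions.create(...) posts to .../v1/chat/completions (the working call)
chat_url = f"{base_url}/chat/completions"

print(completions_url)
print(chat_url)
```

So the failure is specific to the `/completions` route while `/chat/completions` is served correctly.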

This behavior is unusual and suggests there might be a bug. Given our tight deadline, could you please investigate this issue as soon as possible? Your prompt assistance would be greatly appreciated.

Thank you very much for your help.