RunPod
Created by naaviii on 8/29/2024 in #⚡|serverless
Urgent: Issue with RunPod vLLM Serverless Endpoint
We are encountering a critical issue with the RunPod vLLM serverless endpoint. Specifically, when a network volume is attached, the following call fails:

```python
response = client.completions.create(
    model="llama3-dumm/llm",
    prompt=["hello? How are you "],
    temperature=0.8,
    max_tokens=600,
)
```

But this one works:

```python
response = client.chat.completions.create(
    model="llama3-dumm/llm",
    messages=[{"role": "user", "content": "hell0"}],
    max_tokens=100,
    temperature=0.9,
)
```

And this is the client object:

```python
client = OpenAI(
    api_key=api_key,
    base_url=f"https://api.runpod.ai/v2/endpoint_id/openai/v1",
)
```
This behavior is unusual and suggests there might be a bug. Given our tight deadline, could you please investigate this issue as soon as possible? Your prompt assistance would be greatly appreciated. Thank you very much for your help.
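For context, the two client methods above hit different routes of the OpenAI-compatible API that the vLLM worker exposes: `client.completions.create` maps to `POST .../completions`, while `client.chat.completions.create` maps to `POST .../chat/completions`. A minimal sketch of the two request shapes, using the parameters from the post (`endpoint_id` is a placeholder, as in the original):

```python
# Sketch of the two OpenAI-compatible routes behind the client calls above.
# ENDPOINT_ID is a placeholder, as in the original post.
ENDPOINT_ID = "endpoint_id"
BASE_URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/openai/v1"

# Failing call -> POST {BASE_URL}/completions (plain text-completion route)
completions_payload = {
    "model": "llama3-dumm/llm",
    "prompt": ["hello? How are you "],
    "temperature": 0.8,
    "max_tokens": 600,
}

# Working call -> POST {BASE_URL}/chat/completions (chat route)
chat_payload = {
    "model": "llama3-dumm/llm",
    "messages": [{"role": "user", "content": "hell0"}],
    "temperature": 0.9,
    "max_tokens": 100,
}
```

Since only the payload shape and route differ between the two calls, the failure being specific to `/completions` (and only with a network volume attached) points at that route's handling in the worker rather than at the client configuration.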