vllm worker OpenAI stream

Hi everyone, I followed the Runpod documentation to write a simple OpenAI client against a serverless endpoint running the Llava model (llava-hf/llava-1.5-7b-hf), but I hit the following error:
ChatCompletion(id=None, choices=None, created=None, model=None, object='error', service_tier=None, system_fingerprint=None, usage=None, code=400, message='As of transformers v4.44, default chat template is no longer allowed, so you must provide a chat template if the tokenizer does not define one.', param=None, type='BadRequestError')
Has anyone experienced this issue? Any suggestions for resolving it? Code:
from openai import OpenAI

# Point the client at the Runpod serverless OpenAI-compatible endpoint
client = OpenAI(
    api_key="key",
    base_url="https://api.runpod.ai/v2/123123123/openai/v1",
)

response = client.chat.completions.create(
    model="llava-hf/llava-1.5-7b-hf",
    messages=[{"role": "user", "content": "Hello, how can I use RunPod's serverless platform?"}],
    temperature=0.7,
    max_tokens=100,
)

print(response.choices[0].message.content)
nerdylive · 3d ago
See this Hugging Face Forums topic: "Chat_template is not set & throwing error"
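For anyone else who lands here: below is a minimal sketch of supplying a chat template yourself, assuming the deployed vLLM build accepts the request-level chat_template extra parameter (sent via the OpenAI client's extra_body). The template string is a hypothetical reconstruction of Llava-1.5's USER:/ASSISTANT: prompt format; verify it against the model card before relying on it.

from openai import OpenAI

# Hypothetical Jinja template approximating Llava-1.5's
# "USER: ... ASSISTANT:" prompt format -- check the model card.
LLAVA_CHAT_TEMPLATE = (
    "{% for message in messages %}"
    "{% if message['role'] == 'user' %}USER: {{ message['content'] }}\n"
    "{% elif message['role'] == 'assistant' %}ASSISTANT: {{ message['content'] }}\n"
    "{% endif %}{% endfor %}"
    "ASSISTANT:"
)

client = OpenAI(
    api_key="key",
    base_url="https://api.runpod.ai/v2/123123123/openai/v1",
)

response = client.chat.completions.create(
    model="llava-hf/llava-1.5-7b-hf",
    messages=[{"role": "user", "content": "Hello, how can I use RunPod's serverless platform?"}],
    temperature=0.7,
    max_tokens=100,
    # Assumption: the vLLM version behind the endpoint accepts
    # chat_template as an extra request field and uses it in place
    # of the missing tokenizer template.
    extra_body={"chat_template": LLAVA_CHAT_TEMPLATE},
)
print(response.choices[0].message.content)

Alternatively, the template can be set once at the endpoint level instead of per request; if I recall correctly, the runpod-workers/worker-vllm README documents a CUSTOM_CHAT_TEMPLATE environment variable for this.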