xxxyyy
RunPod
Created by xxxyyy on 11/11/2024 in #⚡|serverless
Chat completion (template) not working with VLLM 0.6.3 + Serverless
There isn't any reported error on the Qwen GitHub regarding the chat template (it uses the SAME template as a model that was released months ago), so I suspect this is a RunPod-specific error?
4 replies
Here's a partial error from the server side:
2024-11-11 16:14:55.477 [q3ubsnv48i2ucs] [error]
ERROR 11-11 21:14:55 serving_chat.py:158] jinja2.exceptions.TemplateSyntaxError: expected token 'end of print statement', got 'name'
ERROR 11-11 21:14:55 serving_chat.py:158] File "<unknown>", line 27, in template
ERROR 11-11 21:14:55 serving_chat.py:158] raise rewrite_traceback_stack(source=source)
ERROR 11-11 21:14:55 serving_chat.py:158] File "/usr/local/lib/python3.10/dist-packages/jinja2/environment.py", line 939, in handle_exception
ERROR 11-11 21:14:55 serving_chat.py:158] self.handle_exception(source=source_hint)
ERROR 11-11 21:14:55 serving_chat.py:158] File "/usr/local/lib/python3.10/dist-packages/jinja2/environment.py", line 768, in compile
This request runs fine without error:
response = client.completions.create(
model="xingyaoww/Qwen2.5-Coder-32B-Instruct-AWQ-128k",
prompt="Runpod is the best platform because",
temperature=0,
max_tokens=100,
)
But this request gives me an error:
response = client.chat.completions.create(
model=MODEL_NAME,
messages=[{"role": "user", "content": "Who are you?"}],
temperature=0,
max_tokens=100,
)
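Since /completions works and only /chat/completions trips over the template, one possible workaround until the template issue is sorted out is to format the chat prompt by hand and send it through the working completions endpoint. This is only a sketch: the `<|im_start|>`/`<|im_end|>` markers are an assumption based on the ChatML convention Qwen models publish, and `client`/`MODEL_NAME` are the same names used in the snippets above:

```python
def to_chatml(messages):
    """Render a list of {"role", "content"} dicts as a ChatML prompt.

    Assumes the model was trained on the ChatML format used by Qwen;
    verify against the model card before relying on this.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    # Open the assistant turn so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([{"role": "user", "content": "Who are you?"}])

# Then reuse the endpoint that already works (uncomment with a live client):
# response = client.completions.create(
#     model=MODEL_NAME,
#     prompt=prompt,
#     temperature=0,
#     max_tokens=100,
#     stop=["<|im_end|>"],  # stop at the end of the assistant turn
# )
```

The `stop` sequence keeps the model from rambling into a new turn; otherwise the output matches what the chat endpoint would have produced with the same sampling settings.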