RunPod
Created by xxxyyy on 11/11/2024 in #⚡|serverless
Chat completion (template) not working with vLLM 0.6.3 + Serverless
I deployed the https://huggingface.co/xingyaoww/Qwen2.5-Coder-32B-Instruct-AWQ-128k model through the Serverless UI, setting the max model context window to 129024 and quantization to awq. I deployed it using the latest version of vLLM (0.6.3) provided by RunPod, and ran into the following error client-side:
ChatCompletion(id=None, choices=None, created=None, model=None, object='error', service_tier=None, system_fingerprint=None, usage=None, code=400, message="expected token 'end of print statement', got 'name'", param=None, type='BadRequestError')
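For context, that message is the standard Jinja2 TemplateSyntaxError text, which suggests the server is failing while parsing the model's chat template rather than rejecting the request payload itself. The client side was a plain chat-completion call through RunPod's OpenAI-compatible route; a minimal sketch of the kind of call that produces this response (the endpoint ID and API key are placeholders, not values from the original post):

```python
# Minimal reproduction sketch, assuming RunPod's OpenAI-compatible
# serverless route. <ENDPOINT_ID> and <RUNPOD_API_KEY> are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1",
    api_key="<RUNPOD_API_KEY>",
)

completion = client.chat.completions.create(
    model="xingyaoww/Qwen2.5-Coder-32B-Instruct-AWQ-128k",
    messages=[{"role": "user", "content": "Write hello world in Python."}],
)

# With the broken template, this prints an object with object='error'
# and the Jinja message shown above instead of a normal completion.
print(completion)
```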