Chat template error for Mistral-7B
I am a beginner at this and need some help resolving an issue. I am serving a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.3 in 16-bit float precision, and when I run inference through the OpenAI-compatible interface I receive this error: `TypeError: 'NoneType' object is not subscriptable`.
Additionally, I was also hitting a mismatch error between the available KV cache (~26k tokens) and the model's maximum length (32k).
I suspect the serverless configuration I have set may be wrong.
A sample guide would be helpful.
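For reference, here is roughly how I am launching the endpoint. This is a minimal sketch, not my exact config: the checkpoint and template paths are placeholders, and the `--chat-template` flag is the workaround I found suggested for the subscript error, which apparently occurs when a fine-tuned tokenizer no longer carries a chat template.

```shell
# Sketch under assumptions: local checkpoint path and template file are
# placeholders. If the fine-tuned tokenizer lost its chat_template, the
# OpenAI-compatible /v1/chat/completions route can fail with
# "TypeError: 'NoneType' object is not subscriptable"; passing a template
# explicitly via --chat-template avoids relying on the tokenizer config.
vllm serve ./my-finetuned-mistral-7b \
    --dtype float16 \
    --chat-template ./chat_template.jinja
```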
4 Replies
I have the same issue
Some docs might help:
https://github.com/vllm-project/vllm/blob/main/docs/source/serving/distributed_serving.rst
https://docs.vllm.ai/en/latest/models/engine_args.html
What's the fix?
Maybe you should change some of the engine arguments when creating the endpoint, at least for the KV cache and model length mismatch.
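Concretely, the two usual options are to cap the context length to what the KV cache can actually hold, or to give the cache more GPU memory. A hedged sketch (the checkpoint path and the exact values are illustrative, not taken from the original post's setup):

```shell
# Option 1: cap the context length at what the KV cache can hold
# (the post reported roughly 26k of cache vs the model's 32k max length).
vllm serve ./my-finetuned-mistral-7b --max-model-len 26000

# Option 2: instead, give the KV cache a larger share of GPU memory
# (vLLM's default --gpu-memory-utilization is 0.9).
vllm serve ./my-finetuned-mistral-7b --gpu-memory-utilization 0.95
```

Either flag should make the reported KV-cache/model-length mismatch go away; capping `--max-model-len` is the safer choice if you don't actually need 32k-token requests.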