RunPod•4mo ago

Chat template error for mistral-7b

2024-10-14T10:19:42.283509829Z --- Starting Serverless Worker |  Version 1.7.0 ---
2024-10-14T10:19:42.283511520Z ERROR 10-14 10:19:42 serving_chat.py:155] Error in applying chat template from request: As of transformers v4.44, default chat template is no longer allowed, so you must provide a chat template if the tokenizer does not define one.
2024-10-14T10:19:42.283814574Z /src/engine.py:183: RuntimeWarning: coroutine 'AsyncMultiModalItemTracker.all_mm_data' was never awaited
2024-10-14T10:19:42.283849707Z   response_generator = await generator_function(request, raw_request=dummy_request)
2024-10-14T10:19:42.283854767Z RuntimeWarning: Enable tracemalloc to get the object allocation traceback

2024-10-14T10:19:42.283509829Z --- Starting Serverless Worker |  Version 1.7.0 ---
2024-10-14T10:19:42.283511520Z ERROR 10-14 10:19:42 serving_chat.py:155] Error in applying chat template from request: As of transformers v4.44, default chat template is no longer allowed, so you must provide a chat template if the tokenizer does not define one.
2024-10-14T10:19:42.283814574Z /src/engine.py:183: RuntimeWarning: coroutine 'AsyncMultiModalItemTracker.all_mm_data' was never awaited
2024-10-14T10:19:42.283849707Z   response_generator = await generator_function(request, raw_request=dummy_request)
2024-10-14T10:19:42.283854767Z RuntimeWarning: Enable tracemalloc to get the object allocation traceback

I am beginner to this things and need some help to resolve this issue. I am using a fine tuned version of mistralai/Mistral-7B-Instruct-v0.3 in a 16 bit float precision and doing the inference using openai interface for which i receiving this error "TypeError: 'NoneType' object is not subscriptable". Additionally I was also facing kv cache 26k and model length 32k mismatch error. I am doubting whether the serverless configurations the i have set is right or wrong. A sample guide would be helpful.

4 Replies