Official template vllm-latest is broken
Hi everyone, I'm trying to deploy a vLLM Pod using the official vllm-latest template, but I get the following error:
Traceback (most recent call last):
  File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/engine.py", line 391, in run_mp_engine
    raise e
  File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/engine.py", line 380, in run_mp_engine
    engine = MQLLMEngine.from_engine_args(engine_args=engine_args,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/engine.py", line 118, in from_engine_args
    engine_config = engine_args.create_engine_config(usage_context)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1075, in create_engine_config
    model_config = self.create_model_config()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 998, in create_model_config
    return ModelConfig(
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/vllm/config.py", line 302, in __init__
    hf_config = get_config(self.model, trust_remote_code, revision,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/vllm/transformers_utils/config.py", line 201, in get_config
    raise ValueError(f"No supported config format found in {model}")
ValueError: No supported config format found in meta-llama/Meta-Llama-3.1-8B-Instruct
I did not apply any overrides to the default template (screenshot attached), and I used an A40 on Secure Cloud (CA region).

3 Replies
It's not our template
Maybe try enabling trust remote code, and also set your HF API key in the environment variable.
Check the HF docs for the env variable name to use for your API key.
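Something like this, as a rough sketch of both suggestions (the token value is a placeholder, and the gated-repo diagnosis is my guess, not confirmed):

```python
# Sketch only: meta-llama/Meta-Llama-3.1-8B-Instruct is a gated repo, so if no
# token is available the config download fails, which can surface as the
# "No supported config format found" error above.
import os

# huggingface_hub reads HF_TOKEN; on RunPod you'd normally set this in the
# template's environment-variables section rather than in code.
os.environ["HF_TOKEN"] = "hf_..."  # placeholder, use your own token

from vllm import LLM

llm = LLM(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    trust_remote_code=True,  # the other suggestion; Llama models usually don't need it
)
```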
That "Container Start Command" section is the equivalent of CMD in a Dockerfile. We have a quick-deploy vLLM template. I recommend going through this documentation https://docs.runpod.io/category/vllm-endpoint