zethos
Cannot find any model weights with `/models/huggingface-cache/hub/models...`
Hi, I built a Docker image following "STEP-2" in the README file.
I created a template from that Docker image with the environment variables below:
MODEL_NAME="migtissera/Tess-3-Mistral-Large-2-123B"
MAX_MODEL_LEN=65536
TENSOR_PARALLEL_SIZE=8
GPU_MEMORY_UTILIZATION=0.92
ENABLE_CHUNKED_PREFILL=1
NCCL_P2P_DISABLE=1
OMP_NUM_THREADS=1
ENFORCE_EAGER=1
The docker image:
snbhanja/tess3mistrallarge128b:latest
I tried to deploy this to a serverless endpoint with eight 48 GB GPUs.
I get the error below, although it did not occur the very first time the endpoint was deployed:
RuntimeError: Cannot find any model weights with `/models/huggingface-cache/hub/models--migtissera--Tess-3-Mistral-Large-2-123B/snapshots/8047f7cc9615909650b6a4ae5d13719d3e11594d`
Even if I delete the serverless endpoint and create a new one from the same image, I get the same error.
Full log:
https://github.com/user-attachments/files/18603761/logs.11.txt
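One way to narrow this down is to check whether the snapshot directory named in the error actually contains weight files. The following is a minimal sketch, not part of the worker image: the helper names `find_weight_files` and `find_broken_links` are hypothetical, and the list of weight file patterns is an assumption about which extensions the loader accepts. It relies on the Hugging Face hub cache layout, where files under `snapshots/` are normally symlinks into a `blobs/` directory, so a dangling link suggests an incomplete or wiped download.

```python
from pathlib import Path

# Assumed set of weight file patterns; adjust to match the loader in use.
WEIGHT_PATTERNS = ("*.safetensors", "*.bin", "*.pt")


def find_weight_files(snapshot_dir):
    """Return weight files in snapshot_dir that resolve to real files."""
    root = Path(snapshot_dir)
    if not root.is_dir():
        return []
    found = []
    for pattern in WEIGHT_PATTERNS:
        for path in sorted(root.glob(pattern)):
            # Path.exists() follows symlinks, so dangling links are skipped.
            if path.exists():
                found.append(path)
    return found


def find_broken_links(snapshot_dir):
    """Return entries in snapshot_dir that are symlinks with a missing target."""
    root = Path(snapshot_dir)
    if not root.is_dir():
        return []
    return [p for p in sorted(root.iterdir()) if p.is_symlink() and not p.exists()]


if __name__ == "__main__":
    # Path copied from the error message in this post.
    snapshot = (
        "/models/huggingface-cache/hub/"
        "models--migtissera--Tess-3-Mistral-Large-2-123B/"
        "snapshots/8047f7cc9615909650b6a4ae5d13719d3e11594d"
    )
    print("weight files found:", len(find_weight_files(snapshot)))
    print("broken symlinks:", len(find_broken_links(snapshot)))
```

If this reports zero weight files, or any broken symlinks, the snapshot was probably never fully downloaded into the image or the cache was cleared afterwards. That would point at the image build step rather than the endpoint settings.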
2 replies
RunPod
•Created by zethos on 1/29/2025 in #⚡|serverless
Need help in fixing long running deployments in serverless vLLM
18 replies
RunPod
•Created by zethos on 1/13/2024 in #⚡|serverless
How to upload a file using an upload API in GPU serverless?
This is my current code; there is a separate FastAPI service running.
4 replies