Created by PatriotSoul on 6/5/2024 in #⚡|serverless
Llama 70B on the vLLM serverless template can't answer a simple question like "what is your name"
I'm loading it with 1 worker and 2 GPUs (80 GB each), but the model can't perform at all: it gives gibberish answers to simple prompts like "what is your name".
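For reference, a minimal sketch of how an endpoint like this is usually queried, assuming the template exposes RunPod's OpenAI-compatible route. The endpoint ID, API key, and model name below are all placeholders, and using the chat-completions format matters: sending a raw prompt to an instruct model without its chat template is a common cause of gibberish output.

```python
import json

# Placeholders -- substitute your own endpoint ID and API key.
ENDPOINT_ID = "YOUR_ENDPOINT_ID"
API_KEY = "YOUR_RUNPOD_API_KEY"

# Assumption: the vLLM serverless template exposes an
# OpenAI-compatible chat route at this path.
url = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/openai/v1/chat/completions"

# The chat-completions format lets vLLM apply the model's own
# chat template instead of receiving a bare string prompt.
payload = {
    "model": "meta-llama/Meta-Llama-3-70B-Instruct",  # placeholder model name
    "messages": [{"role": "user", "content": "What is your name?"}],
    "max_tokens": 64,
    "temperature": 0.7,
}

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

print(json.dumps(payload, indent=2))
# Send with e.g. requests.post(url, headers=headers, json=payload)
```

If the raw-prompt route is used instead, the model sees none of its special tokens (system header, turn delimiters), which often produces exactly this kind of nonsense reply.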