PatriotSoul
RunPod
Created by PatriotSoul on 6/5/2024 in #⚡|serverless
Loading Llama 70B with the vLLM serverless template, it can't answer a simple question like "what is your name"
I'm using the instruct version. It just feels like it's quantized 10x, like the model is very stupid.
45 replies
I'm just setting bfloat16; the rest I leave blank/default. When I load it with the web UI, I get completely different responses.
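For context on the web UI vs. serverless difference: a common cause of this symptom is that a web UI silently wraps the question in the model's chat template, while a raw serverless completion request sends the bare string. A minimal sketch of what an instruct-tuned Llama 3 model expects (the special tokens below are from the Llama 3 instruct format; adjust for the exact model variant you deployed):

```python
def build_llama3_prompt(user_message: str) -> str:
    """Wrap a user message in the Llama 3 instruct chat format.

    Instruct-tuned models are trained on prompts framed with these
    special tokens; sending the bare question instead often yields
    incoherent, "stupid"-looking answers.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("what is your name")
print(prompt)
```

Alternatively, `tokenizer.apply_chat_template(...)` from the `transformers` library builds this string from the model's own template, which avoids hard-coding the tokens.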