•Created by Heartthrob10 on 8/1/2024 in #⚡|serverless
how to set a max output token
Hi, I deployed a fine-tuned Llama 3 via vLLM serverless on RunPod. However, the output is being truncated every time. Does anyone know if we can alter the max output tokens when sending the input prompt JSON?
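For context: with the RunPod vLLM worker, output length is usually capped by the `max_tokens` sampling parameter (vLLM's default is quite low, 16 tokens), which you can pass inside `sampling_params` in the request JSON. Below is a minimal sketch of such a payload; the endpoint ID and API key are placeholders, and the exact input schema is an assumption based on the standard vLLM worker:

```python
import json

# Sketch of a RunPod vLLM worker request payload (schema is an assumption).
payload = {
    "input": {
        "prompt": "Summarize the plot of Hamlet.",
        "sampling_params": {
            "max_tokens": 1024,   # raise the output-token cap (vLLM defaults to 16)
            "temperature": 0.7,
        },
    }
}

print(json.dumps(payload, indent=2))

# To actually send it (requires the `requests` package plus your
# endpoint ID and API key -- both placeholders here):
# import requests
# resp = requests.post(
#     "https://api.runpod.ai/v2/<ENDPOINT_ID>/runsync",
#     headers={"Authorization": "Bearer <API_KEY>"},
#     json=payload,
# )
# print(resp.json())
```

If the output still cuts off at the same length, it may also be worth checking the model's context window (`max_model_len`) configured on the endpoint.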