lostdev
RunPod
•Created by nimishchug on 9/27/2024 in #⚡|serverless
My output is restricted to no of tokens
Even the "max_tokens" placement in the examples here is wrong.
4 replies
RunPod
•Created by nimishchug on 9/27/2024 in #⚡|serverless
My output is restricted to no of tokens
The documentation is bad and doesn't tell you why this is. Here is why: https://discord.com/channels/912829806415085598/1279829584749138109
4 replies
RunPod
•Created by Encyrption on 8/27/2024 in #⚡|serverless
v1 API definitions?
great thank you
7 replies
RunPod
•Created by Encyrption on 8/27/2024 in #⚡|serverless
v1 API definitions?
Is this code something simple you could share? I just started and have logging communication coming up on my TODO. No worries if you'd rather not.
7 replies
RunPod
•Created by lostdev on 9/1/2024 in #⚡|serverless
Response is always 16 tokens.
For the curious, it was the `max_tokens` parameter, which I suspected but didn't know how to remedy. Turns out the proper way to set `max_tokens` in the JSON body of the request is in a `sampling_params` dictionary, and not as a sibling to `prompt`.
So instead of passing `max_tokens` as a sibling of `prompt`, it needs to be nested inside `sampling_params`, which I couldn't figure out until I found the `JobInput` class.
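
A minimal sketch of the two request bodies, reconstructed from the description above rather than copied from the original post. It assumes the usual RunPod serverless convention of wrapping the job in an `input` object; the helper name `build_payload`, the prompt text, and the value 256 are made up for illustration.

```python
import json


def build_payload(prompt: str, max_tokens: int) -> dict:
    """Build a request body with max_tokens nested under sampling_params.

    Assumption: the job is wrapped in an "input" object, as is usual for
    RunPod serverless requests. build_payload is a hypothetical helper.
    """
    return {
        "input": {
            "prompt": prompt,
            "sampling_params": {"max_tokens": max_tokens},
        }
    }


# Placement described as not working: max_tokens as a sibling of prompt.
ignored = {"input": {"prompt": "Hello", "max_tokens": 256}}

# Placement described as working: max_tokens inside sampling_params.
working = build_payload("Hello", 256)

print(json.dumps(working, indent=2))
```

With the first shape the setting apparently never reaches the sampler, which would explain the fixed 16-token responses: 16 is vLLM's default for `max_tokens` when none is supplied.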
2 replies