Nelson
RunPod
Created by Nelson on 1/2/2025 in #⚡|serverless
Serverless SGLang - 128 max token limit problem.
Many thanks for your support! You are a genius. I finally got it working with the following request:
REQUEST
{ "input": { "text": "Tell me a story about three bananas who solve the case of the missing hamburger.", "sampling_params": { "max_new_tokens": 5000, "temperature": 0 } } }
I'm using SGLang now. Thanks again.
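For anyone landing here later, the working request can be built and posted from Python roughly like this. It is only a minimal sketch: `build_sglang_payload` and `post_job` are hypothetical helper names, the endpoint ID and API key are placeholders, and the `Authorization: Bearer` header is an assumption about how the endpoint authenticates. The point is that the token limit sits under `sampling_params` as `max_new_tokens`, not at the top level of `input`.

```python
import json
import urllib.request


def build_sglang_payload(text, max_new_tokens=5000, temperature=0):
    """Build the request body that worked in this thread for the SGLang worker."""
    return {
        "input": {
            "text": text,
            # The limit lives inside sampling_params, not directly under "input".
            "sampling_params": {
                "max_new_tokens": max_new_tokens,
                "temperature": temperature,
            },
        }
    }


def post_job(endpoint_id, api_key, payload):
    """POST the payload to the /run URL used in this thread (performs a network call)."""
    url = f"https://api.runpod.ai/v2/{endpoint_id}/run"
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; check your endpoint's documentation.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `post_job("<ENDPOINT_ID>", "<API_KEY>", build_sglang_payload("Tell me a story..."))` would submit the job; nothing is sent until `post_job` is called.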
19 replies
I'm using this model: https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct, but I've also tried Mistral and got the same results. And you are right: I tried to use the OpenAI template, but I'm still sending requests to this URL: POST https://api.runpod.ai/v2/<MODEL_ID>/run. The documentation says to use base_url="https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1", but that didn't work ...
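The two URLs in this post are different routes and expect differently shaped bodies, which is an easy thing to mix up. The sketch below only builds the URLs and an OpenAI-style chat body so the difference is visible; the helper names are hypothetical, and whether a given endpoint actually serves the `/openai/v1` route depends on the worker image, which this thread does not confirm.

```python
def native_run_url(endpoint_id):
    """URL used in this thread for the native job route; its body is wrapped in {"input": ...}."""
    return f"https://api.runpod.ai/v2/{endpoint_id}/run"


def openai_base_url(endpoint_id):
    """base_url quoted from the documentation in this thread, for OpenAI-style clients."""
    return f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1"


def chat_completions_url(endpoint_id):
    """OpenAI-style clients append /chat/completions to the base_url."""
    return openai_base_url(endpoint_id) + "/chat/completions"


def openai_chat_body(model, prompt, max_tokens):
    """Standard OpenAI chat body: no "input" wrapper, max_tokens at the top level."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
```

In other words, an OpenAI-style body sent to `/run`, or an `{"input": ...}` body sent to the `/openai/v1` route, may not be interpreted the way you expect.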
Thanks for your answers. Unfortunately, I tested all the recommendations, including the openai library, with no luck ... Here is an example of what I'm sending and the 16 tokens I get back from the vLLM endpoint:
REQUEST
{ "input": { "messages": [ { "role": "user", "content": "What is AI?" } ], "temperature": 0.7, "max_tokens": 500 } }
ANSWER
{ "delayTime": 406, "executionTime": 393, "id": "14de16db-8b3d-444e-9358-9d4a001c61b9-u1", "output": [ { "choices": [ { "tokens": [ "Artificial Intelligence (AI) refers to the development of computer systems that can perform" ] } ], "usage": { "input": 40, "output": 16 } } ], "status": "COMPLETED", "workerId": "hn4gcdunggpaoc" }
As I said at the beginning, I'm still stuck at 128 tokens with the SGLang endpoint and 16 with the vLLM one ....
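One thing that stands out in this request is that `temperature` and `max_tokens` sit directly under `input`, while the request that eventually worked for SGLang nested its limit under `sampling_params`. The 16 also happens to match vLLM's default `max_tokens`, which would be consistent with the limit never being picked up. Assuming the vLLM worker follows the same nesting convention (an assumption, not confirmed in this thread), a sketch of moving the keys and checking the reported usage could look like this; `nest_sampling_params`, `output_tokens` and the `SAMPLING_KEYS` list are hypothetical.

```python
# Keys treated as sampling options in this sketch (illustrative, not exhaustive).
SAMPLING_KEYS = ("max_tokens", "temperature", "top_p")


def nest_sampling_params(payload):
    """Return a copy of the payload with sampling keys moved under input.sampling_params."""
    inp = dict(payload["input"])
    params = dict(inp.get("sampling_params", {}))
    for key in SAMPLING_KEYS:
        if key in inp:
            params[key] = inp.pop(key)
    if params:
        inp["sampling_params"] = params
    return {"input": inp}


def output_tokens(response):
    """Sum the generated-token counts reported in a response shaped like the ANSWER above."""
    return sum(item["usage"]["output"] for item in response["output"])
```

Comparing `output_tokens(response)` against the limit you asked for is a quick way to tell whether the setting was honored or a default was applied.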
The same happens, but in this case the maximum is 16 every time 😦
REQUEST
{ "input": { "prompt": "Write a poem about nature." } }
OUTPUT
{ "delayTime": 432, "executionTime": 299, "id": "4bac6d1c-6f5b-454a-a2a0-419cf6a6ecbf-u1", "output": [ { "choices": [ { "tokens": [ " \nThe sun shines bright in the morning sky,\nA fiery hue, that catches" ] } ], "usage": { "input": 7, "output": 16 } } ], "status": "COMPLETED", "workerId": "gzgn2o5m506yzj" }