Misterion
RRunPod
•Created by Misterion on 12/11/2024 in #⚡|serverless
vllm worker OpenAI stream timeout
Nope
20 replies
RRunPod
•Created by Misterion on 12/11/2024 in #⚡|serverless
vllm worker OpenAI stream timeout
Yes, this waits for the whole request to finish.
Adding
stream=True
, sends the request which I can see in the dashboard, but it terminates the connection after ~1 min.20 replies
RRunPod
•Created by Misterion on 12/11/2024 in #⚡|serverless
vllm worker OpenAI stream timeout
I'm not sure how does MODEL_NAME affect this problem at all
20 replies
RRunPod
•Created by Misterion on 12/11/2024 in #⚡|serverless
vllm worker OpenAI stream timeout
yes this is what I meant, sorry
20 replies
RRunPod
•Created by Misterion on 12/11/2024 in #⚡|serverless
vllm worker OpenAI stream timeout
basically what I experience there is that server closes the connection after ~ 1 min in case stream == True, non-streaming works fine
20 replies
RRunPod
•Created by Misterion on 12/11/2024 in #⚡|serverless
vllm worker OpenAI stream timeout
MODEL_NAME is huggingface link as usual
20 replies
RRunPod
•Created by artbred on 9/15/2024 in #⚡|serverless
GGUF vllm
will do that as a workaround, but would be nice to support that natively
16 replies
RRunPod
•Created by artbred on 9/15/2024 in #⚡|serverless
GGUF vllm
we could download the model and pack it in the container, but I was just looking for out-of-the-box solution
16 replies
RRunPod
•Created by artbred on 9/15/2024 in #⚡|serverless
GGUF vllm
the problem is that you have to specify gguf file name, and belive there is no such env var for vllm worker
16 replies
RRunPod
•Created by artbred on 9/15/2024 in #⚡|serverless
GGUF vllm
hi, is there any solution to this?
16 replies
RRunPod
•Created by santiagomartinezbragado on 1/15/2024 in #⚡|serverless
Automate the generation of the ECR token in Serverless endpoint?
Are there any plans to add better support for ECR, or it's low priority?
12 replies
RRunPod
•Created by santiagomartinezbragado on 1/15/2024 in #⚡|serverless
Automate the generation of the ECR token in Serverless endpoint?
Hey, any solution to this? I wonder why this issue is so unpopular, everybody is using docker hub?
12 replies