shensmobile
RunPod
•Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
It caused me a lot of grief. I’m very glad it’s fixed, but it would be great to get more details on what happened and what the mitigation plan will be going forward.
45 replies
I wonder what happened
Thanks!
WOOHOO
Ideally, I would like to be in Canada
No, CA-MTL-1 is not a requirement
Oh
For the endpoint template:
30 GB container disk
MODEL_NAME: my_model
BASE_PATH: /runpod-volume
HF_TOKEN: my_token
Those are all the environment variables that are set
I'm not sure which settings are important but:
24 GB GPU
3 workers, 1 GPU/worker
5 second idle timeout
Flashboot enabled
CA-MTL datacenters
12.1, 12.2, 12.3, 12.4 CUDA versions allowed
4 seconds queue delay
L4, A5000, 3090 GPU types
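For sharing these settings in one piece, a minimal sketch that collects the values listed in these messages into a JSON document. The key names and the `to_shareable_json` helper are made up for illustration and are not RunPod's API schema; only the values come from the messages.

```python
import copy
import json

# Values copied from the messages in this thread. The key names are
# hypothetical and chosen for readability, not taken from RunPod's API.
endpoint_config = {
    "template": {
        "container_disk_gb": 30,
        "env": {
            "MODEL_NAME": "my_model",
            "BASE_PATH": "/runpod-volume",
            "HF_TOKEN": "my_token",
        },
    },
    "gpu_memory_gb": 24,
    "workers": 3,
    "gpus_per_worker": 1,
    "idle_timeout_s": 5,
    "flashboot": True,
    "datacenters": "CA-MTL",
    "allowed_cuda_versions": ["12.1", "12.2", "12.3", "12.4"],
    "queue_delay_s": 4,
    "gpu_types": ["L4", "A5000", "3090"],
}


def to_shareable_json(config):
    """Return the config as JSON with token-like env values masked."""
    safe = copy.deepcopy(config)  # leave the caller's dict untouched
    env = safe.get("template", {}).get("env", {})
    for key in env:
        if "TOKEN" in key.upper():
            env[key] = "<redacted>"
    return json.dumps(safe, indent=2)


if __name__ == "__main__":
    print(to_shareable_json(endpoint_config))
```

Masking the token before pasting avoids leaking a credential into a public channel.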
Or should I try to copy all of the settings across?
Can you see the endpoint configuration from the ID?
Also, thank you so much for the help
I'm not sure which is the endpoint ID
I have these two:
vllm-nutty_teal_junglefowl
vllm-kejv5lkoiilruc
Is there an easy way for me to export the configuration?
It looks like it says "Hello! How can I assist you today?" which completes what Postman received
So in the console/requests log, it looks like the full generation completed.
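One way to compare what the client received against the logged generation is to check whether the stream actually reached its end marker. The sketch below assumes the client is reading an OpenAI-style server-sent-events stream (lines of `data: {...}` ending with `data: [DONE]`); that format is an assumption about this setup, and `summarize_stream` is a hypothetical helper, not part of any library.

```python
import json


def summarize_stream(lines):
    """Summarize OpenAI-style SSE lines.

    Returns (text, saw_done, finish_reason). This assumes chat-completion
    chunks with a `choices[].delta.content` field; other stream formats
    would need a different parser.
    """
    text_parts = []
    saw_done = False
    finish_reason = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            saw_done = True  # the server signalled a clean end of stream
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            text_parts.append(delta.get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(text_parts), saw_done, finish_reason
```

If `saw_done` is false and `finish_reason` is still `None` while the server log shows the full text, the cut-off most likely happened between the worker and the client rather than during generation.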
The request is too long to paste here
Let me know what else I can supply to help