Alpay Ariyak
RunPod
Created by nerdylive on 7/1/2024 in #⚡|serverless
VLLM WORKER ERROR
Only H100s and L40s support fp8
24 replies
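For reference: fp8 quantization needs hardware support (H100- or L40-class GPUs). A minimal sketch of enabling it through vLLM's Python API, assuming a supported GPU; the model name is a placeholder:

```python
from vllm import LLM, SamplingParams

# Placeholder model; fp8 requires a GPU with fp8 support (e.g. H100 or L40),
# otherwise vLLM refuses to load with this quantization setting.
llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct", quantization="fp8")

params = SamplingParams(max_tokens=64)
print(llm.generate(["Hello"], params)[0].outputs[0].text)
```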
RunPod
Created by octopus on 6/25/2024 in #⚡|serverless
Distributing model across multiple GPUs using vLLM
Yeah, that's a vLLM limitation; it doesn't allow a tensor parallel size of 6 or 10
10 replies
RunPod
Created by octopus on 6/25/2024 in #⚡|serverless
Distributing model across multiple GPUs using vLLM
You don't need to set it; it's automatically set to the number of GPUs on the worker
10 replies
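Context for the two replies above: vLLM's tensor parallel size has to divide the model's attention-head count evenly, which is why sizes like 6 or 10 are rejected for most models, and the RunPod vLLM worker derives the value from the worker's GPU count on its own. A minimal sketch of what that amounts to, with a placeholder model name:

```python
import torch
from vllm import LLM

# The worker normally derives this from its own GPU count; shown explicitly here.
# vLLM accepts sizes that divide the model's attention heads (1, 2, 4, 8 for most models).
tp_size = torch.cuda.device_count()

llm = LLM(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # placeholder model name
    tensor_parallel_size=tp_size,
)
```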
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
Sorry for the delay
45 replies
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
This was fixed!
45 replies
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
All other data centers are good
45 replies
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
This seems isolated to that and US-OR
45 replies
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
Is CA-MTL-1 a requirement for you?
45 replies
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
Please do for now; I don't have access to the settings atm
45 replies
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
Of course!
45 replies
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
The second one. I agree it's confusing to tell which is the ID, haha
45 replies
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
And your endpoint ID, please
45 replies
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
Can you share your entire endpoint configuration?
45 replies
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
We're still looking into this
45 replies
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
Okay, that's great to know, so the issue is outside of the worker
45 replies
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
After you send the streaming request and it finishes, can you go to the console and check the status of that request? It should show the full output from the worker; I need to see if it's also cut off there
45 replies
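A minimal sketch of that check done programmatically instead of through the console, using the RunPod serverless status endpoint; the endpoint ID, job ID, and API key are placeholders:

```python
import os
import requests

ENDPOINT_ID = "your-endpoint-id"  # placeholder
JOB_ID = "your-job-id"            # placeholder, returned by /run

resp = requests.get(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/status/{JOB_ID}",
    headers={"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"},
)
resp.raise_for_status()
# The full worker output lives here; compare it against what arrived over the stream
# to see whether the truncation happens in the worker or on the streaming path.
print(resp.json())
```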
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
Were you streaming with OpenAI compatibility or not?
45 replies
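For anyone hitting the same thing: the worker can be streamed either through RunPod's native /run + /stream route or through its OpenAI-compatible route. A minimal streaming sketch against the OpenAI-compatible route, with the endpoint ID and model name as placeholders:

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["RUNPOD_API_KEY"],
    base_url="https://api.runpod.ai/v2/your-endpoint-id/openai/v1",  # placeholder endpoint ID
)

stream = client.chat.completions.create(
    model="your-model-name",  # placeholder, usually the model the worker was deployed with
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```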
RunPod
Created by shensmobile on 6/13/2024 in #⚡|serverless
vLLM streaming ends prematurely
Could you share the full output?
45 replies
RunPod
Created by octopus on 6/11/2024 in #⚡|serverless
Cannot run Cmdr+ on serverless, CohereForCausalLM not supported
Can you share your env vars?
8 replies
RunPod
Created by Casper. on 6/12/2024 in #⚡|serverless
Update worker-vllm to vLLM 0.5.0
For sure, already in progress!
4 replies