octopus
RunPod
Created by octopus on 11/13/2024 in #⚡|serverless
What is the real Serverless price?
Because I'm not sure how much I'm paying. Also, I thought RunPod pricing was the cheapest out there, but then an ad from novita.ai showed up claiming it is 50% cheaper than RunPod: https://novita.ai/gpu-instance/console/serverless
16 replies
I did, but shouldn't that be reflected in the price?
so when is the 30% off applied?
I sent it to you above; look at the bottom right.
But the price on the main page shows $0.00046/s in the first screenshot.
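The arithmetic behind the question in this thread can be sketched out. This is only an illustration: the $0.00046/s rate is the figure quoted above, and treating the 30% off as a flat per-second discount is an assumption, not confirmed pricing.

```python
# Hypothetical cost estimate based on the figures quoted in this thread.
# Assumption: the 30% discount applies flatly to the per-second rate.
BASE_RATE_PER_SEC = 0.00046  # USD/s, the rate shown on the main page
DISCOUNT = 0.30              # the "30% off" asked about above

discounted_per_sec = BASE_RATE_PER_SEC * (1 - DISCOUNT)
per_hour = discounted_per_sec * 3600  # seconds per hour

print(f"discounted: ${discounted_per_sec:.6f}/s (~${per_hour:.4f}/h)")
```

With these numbers the discounted rate works out to about $0.000322/s, roughly $1.16 per hour of active worker time.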
RunPod
Created by Thibaud on 8/8/2024 in #⚡|serverless
can't run 70b
@Thibaud were you able to get the execution time down? I compared mlabonne/Llama-3.1-70B-Instruct-lorablated with Llama-3.0-70B (https://huggingface.co/failspy/Meta-Llama-3-70B-Instruct-abliterated-v3.5), which is the original that 3.1 is based on, and the difference is striking: 3-5 s for 3.1 vs only 0.6-0.8 s for 3.0.
75 replies
RunPod
Created by houmie on 6/28/2024 in #⚡|serverless
vLLM serverless throws 502 errors
I'm getting this error for vLLM too. Did anyone find a solution? About 5% of requests end up failing with this error.
11 replies
RunPod
Created by octopus on 6/11/2024 in #⚡|serverless
Cannot run Cmdr+ on serverless, CohereForCausalLM not supported
8 replies
this is the model we tried: https://huggingface.co/alpindale/c4ai-command-r-plus-GPTQ
Tried that; this is the error we get:
    return future.result()
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/serving_chat.py", line 370, in _load_chat_template
2024-06-12T04:16:52.930112985Z [rank0]:     with open(chat_template, "r") as f:
2024-06-12T04:16:52.930128535Z [rank0]: TypeError: expected str, bytes or os.PathLike object, not dict
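The traceback shows vLLM's `_load_chat_template` calling `open()` on the chat template, so it expects a file path (or a raw Jinja string), while the worker received a dict. A minimal sketch of a workaround, assuming the dict is a named-templates mapping like newer tokenizer_config.json files use (the `"default"` key and helper names here are hypothetical, not part of vLLM's API):

```python
import tempfile

def normalize_chat_template(chat_template):
    """Reduce a chat template to a single Jinja string.

    Assumption: when a dict of named templates is passed, the 'default'
    entry (or the first entry) is the one to use.
    """
    if isinstance(chat_template, dict):
        chat_template = chat_template.get("default") or next(iter(chat_template.values()))
    if not isinstance(chat_template, str):
        raise TypeError(f"expected str template, got {type(chat_template).__name__}")
    return chat_template

def write_template_file(template: str) -> str:
    """Write the template to a file so its path can be handed to the worker."""
    with tempfile.NamedTemporaryFile("w", suffix=".jinja", delete=False) as f:
        f.write(template)
        return f.name
```

The point is simply that whatever ends up in the chat-template setting must be a string or a path, never a dict, on the vLLM version in this traceback.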
RunPod
Created by octopus on 6/10/2024 in #⚡|serverless
What quantization for Cmdr+ using vLLM worker?
@digigoblin can I use the original CohereForAI/c4ai-command-r-plus then? What parameter values should I input, and how much GPU vRAM is needed to run it? Alternatively, I tried alpindale/c4ai-command-r-plus-GPTQ, but it gives an error saying 'CohereForCausalLM is not supported'.
12 replies
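The 'CohereForCausalLM is not supported' error comes from the model's declared architecture not being in the vLLM build's registry of implemented models. A rough pre-flight check can be sketched like this; the `SUPPORTED` set below is illustrative only, not vLLM's actual registry, which varies by version:

```python
# Illustrative architecture check. A real check would consult the model
# registry of the installed vLLM version, not this hard-coded set.
SUPPORTED = {"LlamaForCausalLM", "MixtralForCausalLM", "CohereForCausalLM"}

def check_supported(config: dict) -> bool:
    """Return True if any architecture in the model's config.json
    is implemented by this (assumed) vLLM build."""
    archs = config.get("architectures", [])
    return any(a in SUPPORTED for a in archs)

# A Command-R+ checkpoint declares CohereForCausalLM in its config.json,
# so it only loads on vLLM versions that ship Cohere support.
cfg = {"architectures": ["CohereForCausalLM"]}
print(check_supported(cfg))
```

In practice, the fix for this thread's error is usually upgrading to a vLLM worker version whose registry includes the architecture in question.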
@aikitoria
RunPod
Created by octopus on 2/29/2024 in #⚡|serverless
Serverless calculating capacity & ideal request count vs. queue delay values
@flash-singh any idea?
4 replies
RunPod
Created by ashleyk on 2/26/2024 in #⚡|serverless
Unacceptably high failed jobs suddenly
Gotta give @ashleyk a job at this point, he helps everyone
46 replies
RunPod
Created by octopus on 2/26/2024 in #⚡|serverless
Help: Serverless Mixtral OutOfMemory Error
awesome! thanks!
48 replies
Cool! Yeah, the casperhansen/mixtral-instruct-awq worked with your settings.
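The reason an AWQ build avoids the OutOfMemory error can be sketched with back-of-the-envelope arithmetic: weight memory is roughly parameter count times bits per weight over 8. The 20% overhead factor below is an assumption, and this ignores the KV cache, which adds more on top:

```python
def approx_vram_gb(n_params_b: float, bits_per_weight: float, overhead: float = 1.2) -> float:
    """Rough weights-only VRAM estimate in GB.

    Assumption: ~20% overhead for activations and runtime buffers;
    KV cache is not included.
    """
    bytes_for_weights = n_params_b * 1e9 * bits_per_weight / 8
    return bytes_for_weights * overhead / 1e9

# Mixtral 8x7B has roughly 46.7B total parameters.
fp16 = approx_vram_gb(46.7, 16)  # unquantized half precision
awq = approx_vram_gb(46.7, 4)    # 4-bit AWQ
print(f"fp16: ~{fp16:.0f} GB, AWQ: ~{awq:.0f} GB")
```

By this estimate, fp16 Mixtral needs on the order of 110 GB while the 4-bit AWQ build needs under 30 GB, which is why the quantized checkpoint fits on a single large GPU.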
It’s the loader; I’m not sure about the quantization.
Exllamav2_HF is not supported?