octopus
RunPod
•Created by octopus on 11/13/2024 in #⚡|serverless
What is the real Serverless price?
Because I'm not sure how much I'm paying. Also, I thought RunPod pricing was the cheapest out there, but then this ad from novita.ai showed up saying it's 50% cheaper than RunPod: https://novita.ai/gpu-instance/console/serverless
16 replies
RunPod
•Created by octopus on 11/13/2024 in #⚡|serverless
What is the real Serverless price?
I did, but shouldn't that be reflected in the price?
16 replies
RunPod
•Created by octopus on 11/13/2024 in #⚡|serverless
What is the real Serverless price?
so when is the 30% off applied?
16 replies
RunPod
•Created by octopus on 11/13/2024 in #⚡|serverless
What is the real Serverless price?
I sent it above... see the bottom right
16 replies
RunPod
•Created by octopus on 11/13/2024 in #⚡|serverless
What is the real Serverless price?
but the price on the main page shows $0.00046/s in the first screenshot
16 replies
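For anyone sanity-checking the math: a minimal sketch of per-second serverless billing, assuming the $0.00046/s rate from the screenshot and that the thread's 30% discount applies to active (always-on) workers; the execution time and request volume below are made up.

```python
rate_per_s = 0.00046      # $/s, the rate shown on the pricing page
exec_time_s = 2.5         # hypothetical average execution time
requests = 10_000         # hypothetical monthly request volume

flex_cost = rate_per_s * exec_time_s * requests
active_cost = flex_cost * 0.70  # assumption: the 30% off applies to active workers
print(f"flex: ${flex_cost:.2f}, active: ${active_cost:.2f}")
```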
RunPod
•Created by Thibaud on 8/8/2024 in #⚡|serverless
can't run 70b
@Thibaud were you able to get the execution time lowered? I compared mlabonne/Llama-3.1-70B-Instruct-lorablated with Llama-70B-3.0 (https://huggingface.co/failspy/Meta-Llama-3-70B-Instruct-abliterated-v3.5), the original that 3.1 is based on, and the difference is striking: 3-5 s for 3.1 vs. only 0.6-0.8 s for 3.0.
75 replies
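One way to compare the two checkpoints objectively is to read the delayTime and executionTime fields RunPod returns from a /runsync call; a minimal sketch, assuming a deployed endpoint (the endpoint ID and payload are placeholders):

```python
import os
import requests

API_KEY = os.environ["RUNPOD_API_KEY"]
ENDPOINT_ID = "your-endpoint-id"  # placeholder

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"prompt": "Hello", "max_tokens": 64}},
    timeout=300,
)
body = resp.json()
# delayTime = queue/cold-start wait, executionTime = handler time (both in ms)
print(body.get("delayTime"), body.get("executionTime"))
```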
RunPod
•Created by houmie on 6/28/2024 in #⚡|serverless
vLLM serverless throws 502 errors
I'm getting this error too with vLLM. Did anyone find a solution? About 5% of requests end up failing with this error.
11 replies
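While the root cause is investigated, a client-side retry is a common stopgap for intermittent 502s; a minimal sketch (URL and payload are placeholders). With ~5% independent failures, four attempts leave roughly a 0.0006% residual failure rate.

```python
import time
import requests

def post_with_retry(url, headers, payload, tries=4):
    """Retry transient 502s with exponential backoff (1s, 2s, 4s)."""
    for attempt in range(tries):
        resp = requests.post(url, headers=headers, json=payload, timeout=300)
        if resp.status_code != 502:
            return resp
        time.sleep(2 ** attempt)
    return resp  # still 502 after all tries; surface it to the caller
```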
RunPod
•Created by octopus on 6/11/2024 in #⚡|serverless
Cannot run Cmdr+ on serverless, CohereForCausalLM not supported
8 replies
RunPod
•Created by octopus on 6/11/2024 in #⚡|serverless
Cannot run Cmdr+ on serverless, CohereForCausalLM not supported
this is the model we tried:
https://huggingface.co/alpindale/c4ai-command-r-plus-GPTQ
8 replies
RunPod
•Created by octopus on 6/11/2024 in #⚡|serverless
Cannot run Cmdr+ on serverless, CohereForCausalLM not supported
Tried that; this is the error we get:
8 replies
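The "CohereForCausalLM is not supported" error usually means the worker's vLLM build predates Command R+ support (added around vLLM 0.4.0). A minimal sketch for checking what a given build knows about; the registry location is from vLLM's source and can shift between versions:

```python
from vllm.model_executor.models import ModelRegistry

# If this prints False, the installed vLLM is too old for Command R+;
# upgrading the worker image (vLLM >= 0.4.0) is the usual fix.
print("CohereForCausalLM" in ModelRegistry.get_supported_archs())
```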
RunPod
•Created by octopus on 6/10/2024 in #⚡|serverless
What quantization for Cmdr+ using vLLM worker?
@digigoblin can I use the original CohereForAI/c4ai-command-r-plus then? What parameter values should I input, and how much vRAM / which GPU is needed to run it? Alternatively, I tried alpindale/c4ai-command-r-plus-GPTQ, but it gives an error saying 'CohereForCausalLM is not supported'.
12 replies
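For reference, a sketch of loading the GPTQ checkpoint directly with vLLM once a new-enough build is installed; the GPU split and context cap are assumptions (the 4-bit weights alone are roughly 55 GB, so two 80 GB or four 48 GB cards is a sane floor):

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="alpindale/c4ai-command-r-plus-GPTQ",
    quantization="gptq",
    tensor_parallel_size=2,   # assumption: split across two 80 GB GPUs
    max_model_len=8192,       # assumption: cap context to bound the KV cache
)
out = llm.generate(["Hello"], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
```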
RunPod
•Created by octopus on 6/10/2024 in #⚡|serverless
What quantization for Cmdr+ using vLLM worker?
@aikitoria said here that vLLM supports Cmdr+: https://discord.com/channels/912829806415085598/948767517332107274/1230643876763537478
12 replies
RunPod
•Created by octopus on 6/10/2024 in #⚡|serverless
What quantization for Cmdr+ using vLLM worker?
@aikitoria
12 replies
RunPod
•Created by octopus on 2/29/2024 in #⚡|serverless
Serverless calculating capacity & ideal request count vs. queue delay values
@flash-singh any idea?
4 replies
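Absent an official formula, a common back-of-envelope for sizing worker count against queue delay is Little's law (concurrency = arrival rate x service time); all numbers below are made up:

```python
import math

arrivals_per_s = 3.0   # hypothetical request rate
exec_time_s = 4.0      # hypothetical average execution time

busy = arrivals_per_s * exec_time_s   # average concurrently busy workers
max_workers = math.ceil(busy * 1.5)   # ~50% headroom keeps queues short
print(busy, max_workers)              # 12.0 18
```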
RunPod
•Created by ashleyk on 2/26/2024 in #⚡|serverless
Unacceptably high failed jobs suddenly
Gotta give @ashleyk a job at this point, he helps everyone
46 replies
RunPod
•Created by octopus on 2/26/2024 in #⚡|serverless
Help: Serverless Mixtral OutOfMemory Error
awesome! thanks!
48 replies
RunPod
•Created by octopus on 2/26/2024 in #⚡|serverless
Help: Serverless Mixtral OutOfMemory Error
Cool! Yeah, the casperhansen/mixtral-instruct-awq worked with your settings.
48 replies
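The exact settings aren't quoted in this excerpt, but here is a sketch of the kind of vLLM configuration that makes the AWQ checkpoint fit; the values are assumptions sized for a single 48 GB GPU (the 4-bit weights are roughly 24 GB):

```python
from vllm import LLM

llm = LLM(
    model="casperhansen/mixtral-instruct-awq",
    quantization="awq",
    max_model_len=16384,          # assumption: cap context to bound the KV cache
    gpu_memory_utilization=0.90,  # assumption: leave a little headroom
)
```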
RunPod
•Created by octopus on 2/26/2024 in #⚡|serverless
Help: Serverless Mixtral OutOfMemory Error
It’s the loader; I’m not sure about the quantization.
48 replies
RunPod
•Created by octopus on 2/26/2024 in #⚡|serverless
Help: Serverless Mixtral OutOfMemory Error
Exllamav2_HF is not supported?
48 replies
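ExLlamav2_HF is a text-generation-webui loader rather than a vLLM quantization backend, so the vLLM worker won't accept it; at the time, vLLM's own backends were names like awq, gptq, and squeezellm. A sketch for listing what a given build accepts (the import path is from vLLM's source and may move between versions):

```python
from vllm.model_executor.layers.quantization import QUANTIZATION_METHODS

print(sorted(QUANTIZATION_METHODS))  # e.g. ['awq', 'gptq', 'squeezellm', ...]
```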