BadNoise
RunPod
•Created by BadNoise on 9/19/2024 in #⚡|serverless
Error with the pre-built serverless docker image
yes, it's always highly available and I'm using all the available regions
it's strange because if there are no GPUs available it should throw me an error. I mean, it's a big problem in production because a request can be stuck loading forever / return an empty response (when a response limit is set)
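As a stopgap, a client-side deadline at least keeps a request from hanging forever. A minimal sketch, assuming the standard serverless REST routes (/run, /status, /cancel); the endpoint ID, API key and the 120-second deadline are placeholders:
```python
import time
import requests

# Placeholders -- substitute your own endpoint ID and API key.
ENDPOINT = "https://api.runpod.ai/v2/<endpoint_id>"
HEADERS = {"Authorization": "Bearer <RUNPOD_API_KEY>"}

def run_with_deadline(payload: dict, deadline_s: float = 120.0):
    """Submit a job and poll it, cancelling if it hasn't finished within deadline_s."""
    job_id = requests.post(f"{ENDPOINT}/run", json={"input": payload}, headers=HEADERS).json()["id"]
    start = time.monotonic()
    while time.monotonic() - start < deadline_s:
        status = requests.get(f"{ENDPOINT}/status/{job_id}", headers=HEADERS).json()
        if status.get("status") in ("COMPLETED", "FAILED", "CANCELLED"):
            return status
        time.sleep(2)
    # Worker looks stuck (e.g. frozen while loading weights): cancel instead of waiting forever.
    requests.post(f"{ENDPOINT}/cancel/{job_id}", headers=HEADERS)
    return None
```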
9 replies
RunPod
•Created by BadNoise on 9/19/2024 in #⚡|serverless
Error with the pre-built serverless docker image
thank you for your time in the meantime 🙂
9 replies
RunPod
•Created by BadNoise on 9/19/2024 in #⚡|serverless
Error with the pre-built serverless docker image
correct, it happens on this endpoint vllm-rxlyakgq58h7lf when running on 1 80GB GPU PRO
I'm running this model ModelCloud/Mistral-Large-Instruct-2407-gptq-4bit
It always gets stuck on this log:
2024-09-21T12:30:53.355632180Z (VllmWorkerProcess pid=161) INFO 09-21 12:30:53 model_runner.py:997] Starting to load model ModelCloud/Mistral-Large-Instruct-2407-gptq-4bit...
2024-09-21T12:30:54.669155104Z (VllmWorkerProcess pid=161) INFO 09-21 12:30:54 weight_utils.py:242] Using model weights format ['*.safetensors']
It doesn't happen all the time (maybe 30-40% of it), but from what I've found on Discord I'm not the only one with this problem, and once I delete the worker and start it again it runs smoothly
basically once the model is loaded and the machine is not in cooldown it can process requests, but once it turns off and turns back on to process a new request it sometimes gets stuck on that log, and I have to manually terminate the worker and run it again
I have tried with 2x 80GB GPUs (not PRO), and so far it doesn't break, but the boot-up time increases a lot (from ~30 seconds when the GPU PRO works, to ~2 minutes)
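To isolate whether the hang is inside vLLM's weight loading or in the serverless wrapper, loading the same model directly in a plain pod (with the same vLLM version as the worker image) should reproduce the step shown in that log. A rough sketch, assuming GPTQ support in the installed vLLM build:
```python
from vllm import LLM, SamplingParams

# Same model and quantization as the serverless worker; tensor_parallel_size=1
# matches the single 80GB GPU PRO config (use 2 for the 2x 80GB setup).
llm = LLM(
    model="ModelCloud/Mistral-Large-Instruct-2407-gptq-4bit",
    quantization="gptq",
    tensor_parallel_size=1,
)
print(llm.generate(["Hello"], SamplingParams(max_tokens=16)))
```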
9 replies
RunPod
•Created by BadNoise on 9/19/2024 in #⚡|serverless
Error with the pre-built serverless docker image
yes I did 😅
9 replies
RunPod
•Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
but it's strange that even when I run a stress test on it for over 1 minute the GPU is never used 😅
70 replies
RunPod
•Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
so I have to remove torch and use pytorch and pytorch-cuda=12.1 right?
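Whichever way the install ends up, a quick check inside the worker shows whether the image actually got a CUDA build of torch; if it prints False or a CPU-only version, the pipeline will silently fall back to CPU. A minimal verification snippet:
```python
import torch

print(torch.__version__)           # pip CPU-only wheels usually end in "+cpu"
print(torch.version.cuda)          # e.g. "12.1"; None means a CPU-only build
print(torch.cuda.is_available())   # must be True for the pipeline to use the GPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```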
70 replies
RunPod
•Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
hi! thank you so much for your help, I will try with the suggested docker image 🙂
70 replies
RunPod
•Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
thank you 🙂
70 replies
RunPod
•Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
@nerdylive tried now, still 100% CPU usage and 0% for the GPU 😦
70 replies
RunPod
•Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
always ~5 seconds with 5 concurrent requests on a 32 vcpu
70 replies
RunPod
•Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
another strange thing is that on a cheap CPU on a Hugging Face inference endpoint it performs faster than on a 24GB GPU on RunPod (that's also why I think it's not using the GPU) 😅
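For reference, a transformers pipeline stays on CPU unless a device is passed explicitly, which would explain exactly this symptom. A minimal handler sketch assuming the standard runpod Python SDK; the task and model name are placeholders:
```python
import runpod
import torch
from transformers import pipeline

# "some-org/some-model" and the task are placeholders -- use what the endpoint actually serves.
DEVICE = 0 if torch.cuda.is_available() else -1   # -1 = CPU fallback
pipe = pipeline("text-classification", model="some-org/some-model", device=DEVICE)

def handler(job):
    # job["input"] is whatever the client sends in the request's "input" field.
    return pipe(job["input"]["text"])

runpod.serverless.start({"handler": handler})
```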
70 replies
RunPod
•Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
thanks for the tip, but I'm performing stress tests, constantly sending requests to it for 1 minute to understand how many requests it can handle, so it's always running
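A rough sketch of that kind of stress loop, with a few concurrent clients hitting the synchronous route for a minute; the endpoint ID, API key and payload are placeholders, and /runsync is the standard synchronous serverless route:
```python
import time
from concurrent.futures import ThreadPoolExecutor
import requests

# Placeholders -- substitute the real endpoint ID, API key and payload.
URL = "https://api.runpod.ai/v2/<endpoint_id>/runsync"
HEADERS = {"Authorization": "Bearer <RUNPOD_API_KEY>"}
PAYLOAD = {"input": {"text": "hello"}}

def client(stop_at: float) -> int:
    """Send requests back to back until stop_at; return how many completed."""
    done = 0
    while time.monotonic() < stop_at:
        requests.post(URL, json=PAYLOAD, headers=HEADERS, timeout=60)
        done += 1
    return done

stop_at = time.monotonic() + 60                    # 1-minute test window
with ThreadPoolExecutor(max_workers=5) as pool:    # 5 concurrent clients
    counts = list(pool.map(client, [stop_at] * 5))
print(f"handled {sum(counts)} requests in 60s across 5 clients")
```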
70 replies
RunPod
•Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
sure no problem, I see 100% CPU usage and 0% for the GPU
70 replies
RunPod
•Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
let me try again cause I don't remember 😅 I'll launch the 32vcpu and let you know!
70 replies
RunPod
•Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
if you'd prefer I can give you single files
70 replies