houmie — 3mo ago

Is serverless cost per worker or per GPU?

I'm looking at serverless GPU options, and a 48 GB GPU costs $0.00048/s. But is that per worker or per GPU?
For example, if I set max workers to 3, will I be charged 3 × $0.00048/s when all three are in use? That would get very expensive very quickly... Thanks
6 Replies
ngagefreak05 — 3mo ago
Yes, it is per GPU-second, so the cost is the number of GPUs in use, billed per second.
houmie — 3mo ago
But the three workers will run in parallel, right? Does that mean all workers are utilising the same GPU, or will they run on different GPUs of the same class?
ngagefreak05 — 3mo ago
They will utilize different GPUs.
houmie — 3mo ago
I see. And they would still charge me only $0.00048 per second even with 5 workers running in parallel on 5 different GPUs? In that case, it's not a bad deal. Please confirm whether I've understood this correctly. Thanks
houmie — 3mo ago
Actually, it seems that's not true. The price is per worker.
nerdylive — 3mo ago
Yep, per worker, per GPU.
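
A minimal sketch of the billing arithmetic the thread settles on (each active worker occupies its own GPU and is billed per second); the function name and structure are hypothetical, and the $0.00048/s rate is the 48 GB GPU figure quoted in the original question:

```python
# Assumption: each active worker runs on its own GPU, so total cost
# scales linearly with the number of workers in use.
RATE_PER_GPU_SECOND = 0.00048  # USD/s for a 48 GB GPU (from the question)

def serverless_cost(active_workers: int, seconds: float,
                    rate: float = RATE_PER_GPU_SECOND) -> float:
    """Total cost when `active_workers` run in parallel on separate GPUs."""
    return active_workers * seconds * rate

# 3 workers busy for one hour: 3 * 3600 * 0.00048 = $5.184
print(f"${serverless_cost(3, 3600):.3f}")
```

So with max workers set to 3 and all three busy, you pay three times the single-GPU rate for as long as they stay active; idle workers that have scaled to zero incur no GPU-second charges.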