RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

⚡|serverless

llvmpipe is being used instead of GPU

I am a bit lost. I am planning on running waifu2x or Real-ESRGAN, but the output says it's using llvmpipe and the process is very slow. How can I make my container use the GPU?...
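A quick way to narrow this down is to check whether the container can see a CUDA device at all; if it can't, Vulkan/OpenGL tools will silently fall back to the llvmpipe software rasterizer. A minimal sketch, assuming PyTorch is installed in the image:

```python
# Sanity check: is a GPU actually visible inside this container?
# (Assumes PyTorch is installed in the image; nvidia-smi ships with the driver.)
import subprocess

import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))

# If nvidia-smi is missing or errors out, the container was not started with
# GPU access, and Vulkan/OpenGL apps will fall back to llvmpipe (CPU rendering).
result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
print(result.stdout or result.stderr)
```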

1s delay between execution done and Finished message

I get almost one second of delay between a console message at the end of my handler and the "Finished" message. I am wondering why, and how to reduce this....

Serverless is Broken

Something is clearly broken. Delay times are around 2 minutes; even when the same worker gets back-to-back requests, it still takes 2 minutes. It's not a cold-start issue, because even my normal cold starts don't take longer than 15 seconds.

EU-RO-1 region serverless H100 GPU not available...

I use serverless in the EU-RO-1 region because I save my data in EU-RO-1. The problem is that there is no H100 GPU available in EU-RO-1. I created the job via the EU-RO-1 serverless API and have been waiting 6 hours, but the job status stays in queue. How can I solve this? I can't use another region because my data is saved in EU-RO-1...

Workers wrongfully reported as "idle"

When I call my serverless API endpoint, instead of serving my request it continues building the image, while the worker is reported as "idle" and then "running" when called. So I cancel the request, but then the only way to make it stop (so it doesn't keep billing me) is deleting the worker...

"Throttled" and re-"Initializing" workers everywhere today

Is there some incident going on with serverless today? I have 30 workers that are all "Throttled", other workers just disappear, and others initialize in their place all the time. Every request that normally takes 10 seconds is taking minutes... This is happening in multiple locations too. Most of my workers ended up in CA-MTL-1, but others in EU-* are showing the same problems...

How to run FLUX + LoRA on a 24 GB GPU through code

Hi there, could anyone help me with how to run inference with FLUX + LoRA on 24 GB GPUs? Thanks...
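A common starting point is diffusers with CPU offload so the model fits on a single 24 GB card. The sketch below is only an illustration: the LoRA repo name is a placeholder, and FLUX.1-dev requires accepting its license on Hugging Face first.

```python
# Minimal sketch: FLUX.1-dev + a LoRA via diffusers on a single 24 GB GPU.
# The LoRA repo name below is a placeholder.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
)
pipe.load_lora_weights("your-username/your-flux-lora")  # placeholder LoRA
pipe.enable_model_cpu_offload()  # offload idle submodules to CPU to fit in 24 GB

image = pipe(
    "a photo of an astronaut riding a horse",
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=28,
).images[0]
image.save("flux_lora_out.png")
```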

Queue waiting 5+ minutes with dozens of idle workers

Lately I often find the queue sitting with items that have been queued for over 5 minutes while there are dozens of idle workers. Why are the workers not picking up the queued items immediately? My application is in production, and this delay on requests for seemingly no reason is not really acceptable. Thanks...

Serverless H200?

Hi, when can we expect H200s to become available on serverless? My application could use the higher GPU memory.

using compression encoding for serverless requests

Just wondering whether the serverless endpoint is capable of receiving and processing compressed requests (e.g. zstd, gzip)?
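Whether the serverless gateway actually decompresses such requests is exactly what this thread is asking, so treat the snippet below purely as the client side of a gzip-compressed /runsync call; the endpoint ID and API key are placeholders.

```python
# Client-side sketch of a gzip-compressed runsync request. Whether the
# serverless gateway honors Content-Encoding: gzip is the open question here.
import gzip
import json

import requests

ENDPOINT_ID = "your-endpoint-id"   # placeholder
API_KEY = "your-runpod-api-key"    # placeholder

payload = json.dumps({"input": {"prompt": "hello " * 10_000}}).encode("utf-8")

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    data=gzip.compress(payload),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",  # only useful if the gateway decompresses it
    },
    timeout=120,
)
print(resp.status_code, resp.text[:200])
```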

Throttled ECR Download?

We have a serverless endpoint that uses an ECR registry to back the image. When initializing a new worker, the download of a changed layer (which is 3 GB) can sometimes take >20 minutes. Is this download speed typical? Is there another pattern we should be using? It's surprising that a pull from ECR is such a large bottleneck on our cold-start time...

Need some help to troubleshoot a configuration of a Serverless

I created my account and subscribed so I could create a serverless endpoint, and I set it up using the web interface, but it doesn't seem to work. I need some help ASAP.

Do Webhook Request Responses have a retry mechanism?

If a response webhook fails, is there a retry mechanism in place for resending the webhook? If yes, what does it look like, i.e. how many retries and for how long?...

Incorrect billing

The billing for the last 4 weeks seems to be wrong; can someone help me understand it? I am using only two serverless endpoints and no other services. Endpoint IDs: ed0rivbjvv0x0u and pzfz3xhwa86raj

Request getting stuck

Hey, I am using a RunPod endpoint and all my requests are stuck. It's mission critical. I have raised a ticket. I'm using a network volume in EU-SE-1.

Serverless endpoint status and runsync not returning data anymore in request body (request not found)

Hey Team, I have a custom serverless endpoint worker. It always works. The logs always show that everything went as planned and the requests are always marked as completed after the time I expect. However, on my API the requests error out and on the UI they show completed but have no output. When I inspect the status on Thunderclient, runpod says that the request does not exist. I would like to understand what is going on and how I can make my api more resilient to these issues. Attached are screenshots of the behavior:...

I want to increase/decrease workers by code, can you help?

I have a serverless setup already. Generally we keep 1 active worker during the hours when we expect traffic throughout the day, and at night, when no one is using the application, we set active workers to 0 to avoid any charges. Then the next day we set active workers back to 1 manually from the RunPod dashboard. We would like to do that automatically. I know there is a GraphQL API, but I am not able to find relevant code to do that. Can anyone please help?...
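One way to automate this is to call RunPod's GraphQL API from a scheduled job. The sketch below assumes the saveEndpoint mutation and the workersMin/workersMax field names as described in RunPod's GraphQL docs; verify them against the current schema, since additional fields (e.g. name, gpuIds) may also be required.

```python
# Sketch: flip active (min) workers between 0 and 1 on a schedule via the
# RunPod GraphQL API. Mutation and field names taken from RunPod's GraphQL
# docs -- double-check against the current schema before relying on this.
import requests

API_KEY = "your-runpod-api-key"   # placeholder
ENDPOINT_ID = "your-endpoint-id"  # placeholder


def set_active_workers(min_workers: int, max_workers: int) -> dict:
    mutation = f"""
    mutation {{
      saveEndpoint(input: {{
        id: "{ENDPOINT_ID}",
        workersMin: {min_workers},
        workersMax: {max_workers}
      }}) {{
        id
        workersMin
        workersMax
      }}
    }}
    """
    resp = requests.post(
        "https://api.runpod.io/graphql",
        params={"api_key": API_KEY},
        json={"query": mutation},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()


# e.g. run set_active_workers(1, 3) in the morning and set_active_workers(0, 3)
# at night from a cron job or scheduler.
```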

Support for https://huggingface.co/deepseek-ai/DeepSeek-V3?

Would it be possible to get support for https://huggingface.co/deepseek-ai/DeepSeek-V3? It is currently the best open-source model for coding.

Serverless Idle Timeout is not working

One of my serverless endpoints is not respecting the idle timeout setting. Instead of staying active for 300 seconds, it turns idle after 5. I redeployed the endpoint and it worked for a while, but today, again without any changes, the endpoint turns idle after 5 seconds even though it's set to 300...