RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

How does the soft check on the worker limit work?

I've noticed that the first soft cap is about $100, so I guess that having a balance of more than $100 will increase my worker limit. What happens if my balance drops to $90 afterwards? Will my limit be lowered? What will happen to active workers?
Solution:
The soft limit only checks your balance at the time of the upgrade; if you fall below that balance later, you will not lose access to the upgraded worker count.

Stuck in the initialization

Seems that I'm stuck in an initialization loop, e.g.:
```
2024-06-24T10:47:39Z worker is ready
2024-06-24T10:49:04Z loading container image from cache
2024-06-24T10:49:33Z The image runpod/worker-vllm:stable-cuda12.1.0 already exists, renaming the old one with ID sha256:08d4ab2735bbe3528acdd1a11322c570347bcf3b77c9779e9886e78b647818bd to empty string...
```
Solution:
I've cloned my endpoint and deleted the original one. The cloned one seems to work just fine.

Cannot stream OpenAI-compatible response out

I have the below code for streaming the response; the generator is working, but it cannot stream the response:
```
llm = Llama(model_path="Phi-3-mini-4k-instruct-q4.gguf", n_gpu_layers=-1, n_ctx=4096, ...
```
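
A minimal sketch of one way this is usually wired up on RunPod, for comparison: the handler must itself be a generator (with `return_aggregate_stream` enabled), and the client then reads from the endpoint's /stream/{job_id} route. Model path and parameters are taken from the question; the input shape and prompt handling are assumptions.
```
# Hedged sketch, not the poster's full code: a generator handler lets the
# RunPod SDK emit partial output, which clients read via /stream/{job_id}.
import runpod
from llama_cpp import Llama

llm = Llama(model_path="Phi-3-mini-4k-instruct-q4.gguf", n_gpu_layers=-1, n_ctx=4096)

def handler(job):
    prompt = job["input"]["prompt"]  # assumed input shape
    # stream=True makes llama-cpp return an iterator of completion chunks
    for chunk in llm.create_completion(prompt, max_tokens=512, stream=True):
        yield chunk["choices"][0]["text"]  # each yield becomes one stream event

# return_aggregate_stream also exposes the concatenated text on the sync routes
runpod.serverless.start({"handler": handler, "return_aggregate_stream": True})
```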

[URGENT] Failed to return results

Hi, I am having issues for a few hours with one of my serverless pods. When the process ends, it fails to reach api.runpod:
```
2024-06-23T09:09:05.462788318Z {"requestId": "sync-53542990-e57d-4f02-acb4-988800d2cd1a-u1", "message": "Failed to return job results. | Connection timeout to host https://api.runpod.ai/v2/2ylrt71iu9oxpi/job-done/wy06bwgvghwp50/sync-53542990-e57d-4f02-acb4-988800d2cd1a-u1?gpu=NVIDIA+RTX+A4000&isStream=false", "level": "ERROR"}
```
...

Is there an equivalent of flash boot for CPU-only serverless?

I was trying to figure out if there is a way to have a CPU job fire up only when it is needed, so it would not accrue charges when idle (like FlashBoot for GPU serverless). Thanks!

Why is the number of available GPUs only 1?

I want to run my pod with at least 2 GPUs. My pod is an A5000. Now the available GPUs are only 2. What happened?...
Solution:
@Robbie if you created the pod, you can't edit the number of GPUs; you would need to make a new one with the correct amount.

Faster-Whisper worker template is not fully up-to-date

Hi, we're using the Faster-Whisper worker (https://github.com/runpod-workers/worker-faster_whisper) on Serverless. I saw that Faster-Whisper itself is currently on version 1.0.2, whereas the RunPod template is still on 0.10.0. A few changes have been introduced in Faster-Whisper since then (it now uses CUDA 12) that we would like to benefit from, especially the language_detection_threshold setting: most of our transcriptions of speakers with a British accent are being transcribed into Welsh (with a language detection confidence of around 0.51 to 0.55), which could be circumvented by increasing the threshold....
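
For reference, a minimal sketch of the setting in question when calling Faster-Whisper 1.0.x directly; the model size, audio file, and 0.7 threshold are illustrative, and pinning `language="en"` would sidestep detection entirely:
```
# Hedged sketch against faster-whisper >= 1.0: detections below the
# threshold are not trusted (cf. the ~0.51-0.55 Welsh confidences above).
from faster_whisper import WhisperModel

model = WhisperModel("large-v3", device="cuda", compute_type="float16")
segments, info = model.transcribe(
    "audio.mp3",
    language_detection_threshold=0.7,  # illustrative value
)
print(info.language, info.language_probability)
```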

Slow IO speeds on serverless

An A6000 always-active worker takes twice as long to run my code as a normal A6000 pod; I think it is I/O speed. How can I see I/O speeds?
Solution:
It looks like the method I was using for seeking had really high I/O. Changing to another method sped up serverless a lot, but not necessarily a ton on a pod. This leads me to believe that serverless I/O is just slow.
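
A rough way to see the numbers yourself; a minimal sketch (path, file size, and chunk size are arbitrary) that can be run on both a pod and a worker for comparison:
```
# Hedged sketch: time a 1 GiB sequential write to estimate disk throughput.
import os, time

path = "/tmp/io_test.bin"
chunk = os.urandom(1 << 20)      # 1 MiB buffer
total = 1 << 30                  # 1 GiB total

t0 = time.time()
with open(path, "wb") as f:
    for _ in range(total // len(chunk)):
        f.write(chunk)
    f.flush()
    os.fsync(f.fileno())         # force data to disk so the timing is honest
print(f"write: {total / (time.time() - t0) / 1e6:.0f} MB/s")
os.remove(path)
```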

How to download models for Stable Diffusion XL on serverless?

1) I created a new network storage of 26 GB for various models I'm interested in trying.
2) I created a Stable Diffusion XL endpoint on serverless, but couldn't attach the network storage.
3) After the deployment succeeded, I clicked on edit endpoint and attached that network storage to it. So far so good, I believe. But how exactly do I download various SDXL models into my network storage, so that I could use them via Postman?...
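
One hedged approach, not official RunPod guidance: attach the network volume to a temporary pod, where it mounts at /workspace, and download the weights there; the same volume then shows up at /runpod-volume inside serverless workers. The repo ID and filename below are illustrative.
```
# Hedged sketch: run this on a temporary pod with the network volume attached.
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="stabilityai/stable-diffusion-xl-base-1.0",
    filename="sd_xl_base_1.0.safetensors",
    local_dir="/workspace/models",  # /workspace is the network volume on pods
)
```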

0% GPU utilization and 100% CPU utilization on Faster Whisper quick deploy endpoint

I used the "Quick Deploy" option to deploy a Faster Whisper custom endpoint (https://github.com/runpod-workers/worker-faster_whisper). Then, I called the endpoint to transcribe a 1 hour long podcast by using the following parameters: ``` { 'input': { 'audio': 'https://www.podtrac.com/pts/redirect.mp3/pdst.fm/e/traffic.megaphone.fm/ISOSO6446456065.mp3?updated=1715037715',...

Loading models from network volume cache is taking too long.

Hello all, I'm loading my model like the following so that I can use the cache from my network volume:
```
model = AutoModel.from_pretrained(...
```
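
For context, a minimal sketch of that pattern, assuming a serverless worker where the network volume mounts at /runpod-volume (the model name is a placeholder). Worth noting: network volumes are network-attached storage, so large sequential reads from them are typically much slower than from container-local disk, which may be the whole story here.
```
# Hedged sketch: point the Hugging Face cache at the network volume so
# downloaded weights are reused across workers.
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "sentence-transformers/all-MiniLM-L6-v2",  # placeholder model
    cache_dir="/runpod-volume/hf-cache",       # persists across cold starts
)
```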

Are webhooks fired from Digital Ocean?

I set up a WAF in AWS to block bots, and I am getting a bunch of requests to my RunPod Serverless webhook blocked by AWS#AWSManagedRulesBotControlRuleSet#SignalKnownBotDataCenter. The IP addresses in these requests seem to belong to a Digital Ocean data center. I have temporarily disabled the WAF on my ALB for my RunPod webhooks, but I'm hoping that someone can confirm whether these are legitimate requests or not, because I was under the impression that RunPod uses AWS and not Digital Ocean.

best architecture opinion

Hello, I would like to build an app that, out of 1 prompt specified by a user, creates 10 prompts, then calls a model once for each of these 10 prompts, giving me 10 responses, and then does a final call to aggregate the 10 responses into one final response that is returned to the user. My question is the following: do you have any advice on how to build this? Option a) send the user prompt to the serverless endpoint and, within the endpoint, create the 10 prompts, call the model sequentially, and then call it one last time to aggregate the result, all of that in 1 call from the user to the serverless endpoint...
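
A minimal sketch of what option a) could look like, with `expand`, `call_model`, and `aggregate` as hypothetical stand-ins for the poster's prompt logic:
```
# Hedged sketch of option a): one serverless request fans out to 10 prompts
# and aggregates. The three helpers are hypothetical placeholders.
import runpod

def expand(prompt):                 # hypothetical: derive 10 sub-prompts
    return [f"{prompt} (angle {i})" for i in range(10)]

def call_model(prompt):             # hypothetical: one model call per prompt
    raise NotImplementedError

def aggregate(prompt, answers):     # hypothetical: final combining call
    raise NotImplementedError

def handler(job):
    user_prompt = job["input"]["prompt"]
    prompts = expand(user_prompt)
    answers = [call_model(p) for p in prompts]  # sequential; could run concurrently
    return {"output": aggregate(user_prompt, answers)}

runpod.serverless.start({"handler": handler})
```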

Cancelling a job resets FlashBoot

For some reason, whenever we cancel a job, the next time the serverless worker cold boots it doesn't use FlashBoot and instead reloads the LLM model weights into the GPU from scratch. Any idea why cancelling jobs might be causing this problem? Is there maybe a more graceful solution for stopping jobs early than the /cancel/{job_id} endpoint?

RUNPOD_API_KEY and MAX_CONTEXT_LEN_TO_CAPTURE

We are also starting a vLLM project, and I have two questions: 1) In the environment variables, do I have to define RUNPOD_API_KEY with my own secret key to access the final vLLM OpenAI endpoint? 2) Isn't MAX_CONTEXT_LEN_TO_CAPTURE now deprecated? Do we still need to provide it if MAX_MODEL_LEN is already set? ...
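
On question 1, for what it's worth: in RunPod's documented pattern the RunPod API key is passed by the client as the bearer token, not set as a worker environment variable. A hedged sketch, where the endpoint ID and model name are placeholders:
```
# Hedged sketch of calling the worker's OpenAI-compatible route with the
# openai client; <RUNPOD_API_KEY>, <ENDPOINT_ID>, <MODEL_NAME> are placeholders.
from openai import OpenAI

client = OpenAI(
    api_key="<RUNPOD_API_KEY>",
    base_url="https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1",
)
resp = client.chat.completions.create(
    model="<MODEL_NAME>",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```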

Do I need to allocate extra container space for Flashboot?

I'm planning to use a Llama 3 model that takes about 40 GB of space. I believe FlashBoot takes a snapshot of the worker and keeps it on disk to load within seconds when the worker becomes active. Do I need to allocate enough space on the container for this? In this case, since I'm planning to select a 48 GB vRAM GPU, do I need to allocate 40 GB (model) + 48 GB (snapshot) + 5 GB (extra) = 93 GB of container space?
Thanks...

When serverless is used, does the machine reboot if it is executed consecutively? Currently seeing issues

When serverless is used, does the machine reboot if it is executed consecutively? Currently seeing issues with the last execution affecting the next.

unusual usage

Hello! We got billed weirdly this past weekend...

Slow I/O

Hey, I am trying to download a 7 GB file and run an ffmpeg process to extract the audio from that file (it's a video). Locally it takes around 5 minutes on average, but when I try it on the cloud (I chose a CPU, general purpose, since a GPU doesn't seem to give any advantage here), the I/O is SUPER SLOW. Is there anything I can do to speed up the disk I/O?...
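
One hedged workaround, assuming the source is reachable over HTTP: let ffmpeg read the URL directly and stream-copy the audio track, so the 7 GB video never has to land on the slow disk. The URL and output name below are illustrative.
```
# Hedged sketch: ffmpeg reads HTTP inputs directly; -vn drops the video
# stream and "-acodec copy" avoids re-encoding, so only audio is written.
import subprocess

subprocess.run(
    [
        "ffmpeg",
        "-i", "https://example.com/input.mp4",  # placeholder source URL
        "-vn",                                  # drop the video stream
        "-acodec", "copy",                      # copy audio as-is
        "audio.m4a",
    ],
    check=True,
)
```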