RunPod


We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!


First attempt at serverless endpoint - "Initializing" for a long time

Hi. New to RunPod, trying to run a serverless endpoint with a worker based on https://github.com/blib-la/runpod-worker-comfy, and I'm not able to get it past the "Initializing" status. There are NO logs anywhere in the console. Here's what I did:...

(Flux) Serverless inference crashes without logs.

Hi All! I've built a FLUX inference container on RunPod serverless. It works (sometimes), but I get a lot of random failures, and RunPod does not return the error logs to me. E.g. this is the response: ...

Same request running twice

Hi, my request finished a successful run, and then the same worker received the same request again and ran it. How can I fix this?...

serverless workers idle but multiple requests still in the queue

I have set scaling to spin up a new worker when a request has been in the queue for 30 seconds, but no new idle worker is starting beyond the active workers, despite multiple requests sitting in the queue for more than 90 seconds.

Question about serverless vllm endpoint

I would like to deploy Qwen2VL-2B using vLLM serverless. I know that it will create an endpoint I can use to send a prompt, but I wonder if I could also send an image along with the prompt?
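For context, vLLM's OpenAI-compatible API accepts multimodal chat messages where an image travels as a base64 data URL inside an `image_url` content part. A minimal sketch of building such a request body is below — the exact payload shape a given RunPod vLLM serverless deployment accepts is an assumption, so treat this as a starting point, not the definitive format:

```python
import base64
import json

def build_vision_payload(image_bytes: bytes, prompt: str) -> dict:
    # OpenAI-style multimodal chat payload: the image is sent as a
    # base64 data URL inside an "image_url" content part.
    b64 = base64.b64encode(image_bytes).decode()
    return {
        "input": {
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {"type": "text", "text": prompt},
                        {"type": "image_url",
                         "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
                    ],
                }
            ]
        }
    }

payload = build_vision_payload(b"\xff\xd8fake-jpeg-bytes", "What is in this image?")
body = json.dumps(payload)  # POST this to https://api.runpod.ai/v2/<ENDPOINT_ID>/run
```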

Serverless pod tasks stay "IN_QUEUE" forever

I have a TTS model that I've deployed flawlessly as a RunPod Pod, and I want to convert it to a serverless endpoint to save costs. I made an initial attempt, but when I send a request to the deployed serverless endpoint, the task just stays "IN_QUEUE" forever. The last line of my Dockerfile is
CMD ["python", "-u", "runpod.py"]
...
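A job stuck IN_QUEUE often means the worker process never registered a handler. A minimal sketch of a serverless worker is below (the TTS call is a placeholder). One detail worth noting: a script named `runpod.py` will shadow the `runpod` SDK package when it does `import runpod`, since the script's own directory is first on `sys.path` — a name like `handler.py` avoids that:

```python
# handler.py -- avoid naming this file runpod.py: a script with that name
# shadows the `runpod` SDK package on `import runpod`.

def handler(job):
    # Placeholder for the real TTS call; job["input"] is whatever the
    # client sent in the request's "input" field.
    text = job["input"].get("text", "")
    return {"audio_url": None, "echo": text}  # stand-in result

# In the actual worker, register the handler so jobs leave the queue:
#   import runpod
#   runpod.serverless.start({"handler": handler})
```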

not getting any serverless logs using runpod==1.6.2

I had this problem with runpod==1.7.x a week or two ago and was told to downgrade to 1.6.2, which worked. As of today, logs have stopped appearing again.

Add Docker credentials to Template (Python code)

I'm struggling to find out how to add my Docker registry credentials to the template (via Python code). I have the credentials added in the settings, but I can't find how to attach them to the template. Does anyone know how to do that? template = runpod.create_template( name=deployment_name, **TEMPLATE_CONFIG...

Format of video input for vLLM model LLaVA-NeXT-Video-7B-hf

Dear Discord members, I have a question about using the vLLM template with the HuggingFace LLaVA-NeXT-Video-7B-hf model on text+video multi-modal input. Video input is a fairly new feature in the vLLM library and I do not seem to find definitive information on how I should encode the input video so that the running model instance decodes it into the format it understands. The online vLLM AI chatbot suggested a vector of JPEG-encoded video frames but that did not work. The vLLM GitHub gave me the impression that a NumPy array is the right solution but this does not work either....

How to view monthly bills for each serverless instance?

I am currently running multiple serverless instances at the same time, and I need to see how much each of my serverless instances costs per month (or day, or week) so that I can balance my priorities during development. I found the "Billing" section in RunPod, and scrolling down there is a "Billing Explorer / Runpod Endpoints" section as shown in the picture, but it does not display anything (even though I have spent over 300 USD on RunPod in 2 months). May I ask why nothing is showing up, whether I did something wrong, and whether there's any other way to check the bill for each serverless instance? Any answers would be greatly appreciated ❤️...

Issue with KoboldCPP - official template

I tried two models (103B Midnight Miqu v1.0 and 123B Behemoth v1.1) in Q4 GGUF on a pod with the https://www.runpod.io/console/explore/2peen7lpau template. In both cases the models download successfully (2 files each). When launching KoboldCPP I get the following error: Something possibly went wrong, stalling for 3 minutes before exiting so you can check for errors. ...

How to give docker run args like --ipc=host in serverless endpoints


Is Runpod's Faster Whisper Set Up Correctly for CPU/GPU Use?

Hi, I'm currently using the Faster Whisper worker provided by RunPod: https://github.com/runpod-workers/worker-faster_whisper While reviewing the code, I found something confusing:...

Endpoint initializing for eternity (docker 45 Gb)

Hi! My Docker image is about 45 GB, and it has been about 20 hours since it started downloading. https://www.runpod.io/console/serverless/user/endpoint/wzkav7ouarzdxv + there are 6 endpoints like this at the same time, all of them downloading. Our GitLab Docker registry has huge network output speed, so I suppose it should not be bottlenecked by that...

Llama-3.1-Nemotron-70B-Instruct in Serverless

Hello there, I've been trying to deploy Nvidia's Llama-3.1-Nemotron-70B-Instruct in serverless using the vLLM template, but I could not get it to work no matter what. I'm trying to deploy it to an endpoint using 2 x H100 GPUs, but in most of my attempts I don't even see the weights being downloaded. Requests start, and after a few minutes the worker terminates....

Failed to return job results

Hello, I built a WhisperX container using runpod==1.7.4 and deployed it. When transcription is done and the worker tries to return the result, it fails. My worker id: rhxah8am1iugdc log:...

Job delay

Hello, I have been seeing an increase in delay time while workers boot up, even while using FlashBoot. I am using 1.5.3, which seems to have improved it a bit, but not significantly. That said, is there an API that can be called to boot up a worker when a request is incoming in, say, 5 seconds? This would ensure that the worker is warm and ready when the request arrives. Does idle timeout perform a similar function?
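I'm not aware of a dedicated pre-warm API, but a workaround some people use is firing a cheap no-op job at the endpoint's `/run` route a few seconds ahead of the real request. A sketch (the `"warmup"` input key is an assumption — your handler would need to recognise it and return immediately):

```python
import json
import urllib.request

def build_warmup_request(endpoint_id: str, api_key: str) -> urllib.request.Request:
    # Submits a no-op job so a worker cold-starts before the real request.
    url = f"https://api.runpod.ai/v2/{endpoint_id}/run"
    body = json.dumps({"input": {"warmup": True}}).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

# To actually send it:
#   urllib.request.urlopen(build_warmup_request("<ENDPOINT_ID>", "<API_KEY>"))
```

Note this still bills the cold-start and the no-op job, so it only makes sense when the latency matters more than the extra cost.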

How to get `/stream` serverless endpoint to "stream"?

Example from the official documentation: https://docs.runpod.io/sdks/javascript/endpoints#stream
```
from time import sleep
import runpod
```
...
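For reference, streaming works when the handler is a Python generator and the worker is started with `return_aggregate_stream`; each `yield` then becomes one chunk on `/stream`. A minimal sketch (the per-word splitting is just an illustration):

```python
import time

def handler(job):
    # Generator handler: every yield is delivered as one /stream chunk.
    for word in job["input"].get("text", "").split():
        time.sleep(0.05)  # simulate per-token work
        yield {"token": word}

# Registration in the real worker:
#   import runpod
#   runpod.serverless.start({"handler": handler,
#                            "return_aggregate_stream": True})
```

With `return_aggregate_stream` enabled, `/run`/`/runsync` still return the concatenated output, while `/stream` delivers the chunks incrementally.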