RunPod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning, and GPUs!

⚡|serverless

is there example code to access the runpod-worker-comfy serverless endpoint

Hi, I have managed to run the runpod-worker-comfy serverless endpoint, and I know it supports five operations: RUN, RUNSYNC, STATUS, CANCEL, and HEALTH. But I don't know exactly how to access the service from my Python code: how to prepare the API key and the worker ID, how to prepare the request for RUN, how to check the status until the job is finished, and how to download the generated image. Is there example code anywhere for these basic operations? Previously I had Python code that communicated directly with the ComfyUI server: it would create a websocket, send the workflow with an HTTP POST, keep checking the history, and once the work was done, read the image from the output passed through the websocket connection. Wrapped with runpod-worker-comfy the interface is actually simpler, and the input validation is great, but I don't know how to use it from my code and haven't found any example code for it. Sorry for my ignorance...
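For anyone after the same thing, here is a minimal sketch of the usual pattern with plain `requests`: submit to `/run`, poll `/status/{id}`, then decode the image. The endpoint ID and API key come from the RunPod console; the exact shape of `output` (the base64 field name, whether images come inline or as URLs) depends on the runpod-worker-comfy version, so treat that part as an assumption and check the worker's README.

```
import base64
import time

import requests

API_KEY = "YOUR_RUNPOD_API_KEY"    # from the RunPod console; sent as a Bearer token
ENDPOINT_ID = "YOUR_ENDPOINT_ID"   # the endpoint ID (you don't address individual workers)
BASE_URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Submit the job. runpod-worker-comfy expects the ComfyUI workflow
# (API-format JSON) under input; replace the empty dict with yours.
payload = {"input": {"workflow": {}}}
job_id = requests.post(f"{BASE_URL}/run", headers=HEADERS, json=payload).json()["id"]

# Poll STATUS until the job reaches a terminal state.
while True:
    job = requests.get(f"{BASE_URL}/status/{job_id}", headers=HEADERS).json()
    if job["status"] in ("COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"):
        break
    time.sleep(2)

# Assumption: the worker returns the image base64-encoded in the output;
# the exact field name depends on the worker version, so check its README.
if job["status"] == "COMPLETED":
    image_b64 = job["output"]["message"]
    with open("output.png", "wb") as f:
        f.write(base64.b64decode(image_b64))
```

The official `runpod` Python SDK offers the same flow via `runpod.Endpoint(ENDPOINT_ID).run(...)` if you'd rather not hand-roll the HTTP calls.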

Backup plan for serverless network outage

Is this network outage affecting both serverless and on-demand pods? If outages on the two services don't occur simultaneously, can we use pods as a fallback to mitigate a serverless outage? What we need is a stable and reliable service...

delay time

I have a serverless endpoint configured with 15 max workers. However, I notice that only about three of them are actually usable. My workload is configured to time out if it takes longer than a minute to process. The other workers randomly have issues, such as timing out when attempting to return job data, or failing to run entirely and having to be retried on a different worker, leading to delay/execution times of over 2-3 minutes. Executing 6 different jobs gives very different delay times: some worker IDs are consistently low-delay, but some randomly take forever. Is there anything I can do to reduce this randomness? Additionally, can I delete/blacklist the workers that perform poorly...
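On the one-minute timeout mentioned above: RunPod lets you attach an execution policy to each request, so a job that lands on a bad worker is killed and retried sooner rather than hanging for minutes. A sketch, assuming the per-request policy block (the 60-second value mirrors the question; check the execution-policy docs for the exact fields):

```
import requests

# Placeholder endpoint/key; "policy" is RunPod's per-request execution
# policy, with executionTimeout given in milliseconds.
payload = {
    "input": {"prompt": "..."},
    "policy": {"executionTimeout": 60_000},
}
requests.post(
    "https://api.runpod.ai/v2/YOUR_ENDPOINT_ID/run",
    headers={"Authorization": "Bearer YOUR_RUNPOD_API_KEY"},
    json=payload,
)
```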

update worker-vllm to vllm 0.5.0

vLLM just got bumped to 0.5.0, with significant features ready for production. @Alpay Ariyak FP8 is very significant, but so are speculative decoding and prefix caching.
- FP8 support is ready for testing. By quantizing a portion of the model weights to 8-bit floating point, inference speed gets a ~1.5x boost...
Solution:
For sure, already in progress!
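For anyone who wants to try FP8 ahead of the worker update, vLLM 0.5.0 exposes it as a quantization option on the offline `LLM` API; a minimal sketch (the model name is just an example):

```
from vllm import LLM, SamplingParams

# "fp8" is the quantization flag added in the 0.5.x line; weights are
# quantized to 8-bit floating point at load time (model is an example).
llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct", quantization="fp8")
outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```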

SDXL Quick Deploy through Runpod Doesn't work

I sent in a request to test it, such as the one below, and it threw an error. There are other alternatives, so this is not the end of the world for me, but I wanted to give feedback that I don't believe it works.
```
{
  input: {
    prompt: "A cute cat"...
```

Video processing

Hey, what are your approaches and/or recommendations for processing videos in serverless workers?...

can 3 different serverless workers run from the same network volume?

Hi @digigoblin, I have checked your answer about symlinking the network volume directory to the serverless directory and running the worker from the network volume, as if it were a separate pod instance. https://github.com/ashleykleynhans/runpod-worker-comfyui/blob/main/start.sh#L5-L7 ...

Can serverless endpoints make outbound TCP connections?

I know endpoints can make HTTP/HTTPS requests, but is there any limit on outbound connections? Is there a firewall, or are all ports open? What about bandwidth limitations, etc.? Thanks!
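While waiting for an official answer, one easy way to check empirically is to probe a non-HTTP port from inside a running worker; a minimal sketch (host and port are arbitrary examples):

```
import socket

def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Probe a non-HTTP port (587 = SMTP submission) to test the firewall question.
print(can_connect("smtp.gmail.com", 587))
```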

Very slow cold start times

Does anyone know why I would get such variable cold start times, anything from half a second to 90 seconds? I'm using the standard vLLM Serverless template

Uploading a file to network volume takes forever and fails after a few mins

I'm trying to upload a checkpoint file of about 650MB, and the upload speed is about 5-10MB per minute. After a few minutes, it fails with "Invalid response: 524". I'm attaching a screenshot. How can I resolve this issue?...

Cannot run Cmdr+ on serverless, CohereForCausalLM not supported

I'm getting this error for all Cmdr+ models on serverless:
Error initializing vLLM engine: Model architectures ['CohereForCausalLM'] are not supported for now.
Although in vLLM issues we see that CohereForCausalLM is supported...

vLLM Serverless error

When using the vLLM Serverless template I get the following error when trying to use the model - cognitivecomputations/dolphin-2.9-llama3-8b: HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name...
Solution:
It's fixed, it was due to the thing you just posted in #🚨|incidents @haris

Environment Variables not found

I have existing serverless endpoints that were running without issue. I run a check on startup to see if the env vars exist; if not, I throw an error. There was no issue with the endpoint previously, but now it throws because of missing variables. I have checked the template, and the env vars still exist there. Were any changes made in the last month that would cause this issue?...
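For reference, the kind of startup guard described above looks roughly like this (the variable names are just placeholders; use whatever your template defines):

```
import os

# Fail fast at container start if expected variables are absent.
REQUIRED_VARS = ["MODEL_NAME", "HF_TOKEN"]  # example names, not RunPod-defined
missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
if missing:
    raise RuntimeError(f"Missing required environment variables: {missing}")
```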

Does it only accept python language?

I have some text processing in NodeJS that I would like to upload to serverless to see if it runs faster
Solution:
Yes, the JS SDK is incomplete (basically a work-in-progress), so you would need to use the Python SDK
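For a sense of what that means in practice, a minimal worker with the official Python SDK looks like this (the handler body is just an example):

```
import runpod

def handler(job):
    # job["input"] carries whatever you POST under "input" in the request.
    text = job["input"].get("text", "")
    return {"length": len(text)}

# Hand the handler to the serverless runtime; this blocks and serves jobs.
runpod.serverless.start({"handler": handler})
```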

How can I connect my network volume to a serverless endpoint?

Hi all, I know that serverless endpoints can have access to network volumes, but I can't seem to actually make it work. I'm also a first-timer when it comes to Docker/serverless, so I may be doing some very trivial things wrong.

I connected my serverless endpoint to my network volume when setting it up in the UI, but when the endpoint tries to access files, I get "no file/directory". I used `pwd` on the files' location when the storage was connected to a pod and placed that path in an environment variable. So, to be clear: the serverless endpoint calls python3 /workspace/ComfyUI/main.py and fails because the path doesn't exist, even though it does.

Do I need to prefix this in some manner? Call runpod-volume/workspace/ComfyUI/main.py? Create a directory called runpod-volume in my network volume and place everything there? Can I even start the ComfyUI process from within the network volume? I do it like this because that's how I use it on my regular pods, and I use many custom nodes and don't want to have to re-download them on every request.

I'd appreciate anyone's help; the examples and tutorials online are very abstract and specific to a1111...
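One detail worth noting here: on serverless workers the network volume is mounted at /runpod-volume, not /workspace as on pods, so pod-era paths need their prefix swapped. A small sketch of resolving it at startup so the same image works in both environments:

```
import os

# Serverless mounts the network volume at /runpod-volume, pods at /workspace;
# pick whichever exists at runtime.
VOLUME_ROOT = "/runpod-volume" if os.path.isdir("/runpod-volume") else "/workspace"
COMFYUI_MAIN = os.path.join(VOLUME_ROOT, "ComfyUI", "main.py")
print(COMFYUI_MAIN)
```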

Requests remain in status "IN QUEUE" and nothing happens

I deployed a custom Docker image and ran it on a serverless endpoint. All workers are running without errors, and there are no errors in the logs either. What should I do to fix it?
Solution:
Yeah, I think digigoblin is right; you probably need to call handler.py from somewhere.
```
call_python_handler() {
    echo "Calling Python handler.py..."...
```

Anyone have example template for OpenVoice V2 serverless?

I would like to deploy https://github.com/myshell-ai/OpenVoice on serverless. It has a Hugging Face module; can it be implemented via the serverless vLLM quick deploy? If so, are there any instructions for doing so? If not, what are my options for getting it installed?
Solution:
No, that's for vLLM-supported models only; it doesn't support all types of pipelines

What quantization for Cmdr+ using vLLM worker?

I'm trying to set up these Cmdr+ models on serverless using the vLLM worker, but the only options I see are SqueezeLLM, AWQ, and GPTQ. Which quantization should I set when starting these models? https://huggingface.co/CohereForAI/c4ai-command-r-plus-4bit and ...

CPU Instances on 64 / 128 vCPUs FAIL

I can deploy my app on all instances except the 64 and 128 vCPU ones. Both of those run on the AMD EPYC 9754 128-Core Processor. When it tries to run, it gets stuck in QUEUE with the error pasted below, and then just loops between "start container" and "failed to create shim task: the file python was not found: unknown". Any ideas what is causing this and how to resolve it? A similar issue was reported in the pods section, but I am using serverless and getting the same problem. ERROR f...