zfmoodydub
•Created by zfmoodydub on 3/23/2025 in #⚡|serverless
RunPod Serverless Inter-Service Communication: Gateway Authentication Issues
I'm developing an application with two RunPod serverless endpoints that need to communicate with each other:
Service A: A Node.js/Express API that receives requests and dispatches processing tasks
Service B: A Python processor that handles data and needs to notify Service A when complete
Service B successfully processes data but cannot reliably notify Service A about completion:
Direct HTTP calls between services fail with connection errors:
Error: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /api/webhook/completion (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object>: Failed to establish a new connection: [Errno 111] Connection refused'))
RunPod API Gateway calls result in authentication failures (401):
Gateway response status: 401
Gateway error: 401
Interestingly, manual API Gateway requests with identical payloads and headers work correctly.
Core Questions
Is there a networking limitation preventing direct connections between serverless containers? If so, what's the proper way to route traffic between them?
When using the RunPod API Gateway to proxy a webhook request from one serverless endpoint to another, are there specific headers or formats that must be used for authentication to work correctly?
I need to understand the proper pattern for inter-service communication in RunPod serverless environments and the correct authentication mechanism when using the API Gateway as an intermediary.
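For reference, here's a minimal sketch of what I'm attempting from Service B's side: calling Service A through its public RunPod endpoint at https://api.runpod.ai/v2/<endpoint_id>/run with a Bearer API key, rather than trying to reach it on localhost. The endpoint ID and environment variable names below are placeholders.

import os
import requests

# Placeholder names for illustration; the real values come from the worker's environment.
SERVICE_A_ENDPOINT_ID = os.environ["SERVICE_A_ENDPOINT_ID"]
RUNPOD_API_KEY = os.environ["RUNPOD_API_KEY"]

def notify_service_a(job_id: str, result: dict) -> dict:
    """Send a completion notification to Service A via its public RunPod endpoint,
    instead of trying to reach it on localhost from inside another worker."""
    url = f"https://api.runpod.ai/v2/{SERVICE_A_ENDPOINT_ID}/run"
    headers = {
        "Authorization": f"Bearer {RUNPOD_API_KEY}",  # account API key, not an endpoint ID
        "Content-Type": "application/json",
    }
    payload = {"input": {"event": "completion", "job_id": job_id, "result": result}}
    resp = requests.post(url, json=payload, headers=headers, timeout=30)
    resp.raise_for_status()
    return resp.json()

If the gateway expects a different header or token beyond Authorization: Bearer <api key> when one endpoint calls another, that's exactly the part I'm unsure about.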
5 replies
•Created by zfmoodydub on 11/4/2024 in #⚡|serverless
not getting any serverless logs using runpod==1.6.2
I had this problem with runpod==1.7.x a week or two ago and was told to downgrade to 1.6.2, which worked. As of today, logs have stopped appearing again.
2 replies
•Created by zfmoodydub on 10/24/2024 in #⚡|serverless
Worker frozen during long running process
request ID: sync-f144b2f4-f9cd-4789-8651-491203e84175-u1
worker id: g9y8icaexnzrlr
I have a process that should, in theory, take no longer than 90 seconds.
The template is configured to not time out.
When I test the process via the Requests tab in the UI, the logs print smoothly until about halfway through the process, and then they disappear. The job never completes, and the worker goes idle after a minute or two. I can't see the logs to know whether there was a failure or an error.
Would someone mind checking on this for me?
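In case the handler shape matters, here is a minimal, simplified sketch of the pattern I'm using, with a placeholder generator standing in for the real ~90 second task, and flushed, timestamped progress prints so stdout buffering can be ruled out:

import time
import runpod

def long_running_process(job_input, total_steps=10):
    """Placeholder for the real ~90 second task; yields after each chunk of work."""
    for step in range(1, total_steps + 1):
        time.sleep(1)  # stand-in for actual work
        yield step, total_steps

def handler(job):
    job_input = job["input"]
    start = time.time()
    for step, total in long_running_process(job_input):
        # flush=True so progress lines are not held in a stdout buffer while the worker is busy
        print(f"[{time.time() - start:6.1f}s] step {step}/{total}", flush=True)
    return {"status": "done", "elapsed_s": round(time.time() - start, 1)}

runpod.serverless.start({"handler": handler})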
38 replies
•Created by zfmoodydub on 10/24/2024 in #⚡|serverless
Runpod GPU use when using a docker image built on mac
I am building serverless applications that are supposed to use the GPU. While testing locally, the functions that are meant to run on the GPU select a device with the common pattern:
device: str = "cuda" if th.cuda.is_available() else "cpu"
This is required so that the CPU device is used when running locally on a Mac. I would have thought that a Docker image built on a Mac, but with an amd64 machine type specified in the build command and a CUDA base image, would use the CUDA GPU once deployed on the server, but that does not seem to be the case.
I have not been able to figure out why for the longest time. My RunPod serverless workers only show CPU usage when tested.
Any advice?
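For what it's worth, here's the small diagnostic I'd run inside the deployed worker to tell apart "the image only ships a CPU build of torch" from "the container can't see a GPU at all" (assuming th is an alias for PyTorch):

import torch

# Quick diagnostic for the deployed worker: distinguishes a CPU-only torch wheel
# from a container that simply has no visible GPU.
print("torch version:       ", torch.__version__)
print("built with CUDA:     ", torch.version.cuda)          # None => CPU-only wheel
print("cuda available:      ", torch.cuda.is_available())
print("visible device count:", torch.cuda.device_count())
if torch.cuda.is_available():
    print("device name:         ", torch.cuda.get_device_name(0))

My understanding is that if torch.version.cuda comes back None, the wheel baked into the image is CPU-only regardless of the base image or the host GPU.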
25 replies
•Created by zfmoodydub on 10/23/2024 in #⚡|serverless
Multiple endpoints within one handler
I have had success creating serverless endpoints in RunPod with handler.py files that look like this:
import runpod
# ... other imports ...

def handler(job):
    job_input = job['input']
    # ... process the input ...
    return result

runpod.serverless.start({"handler": handler})
Now I'm trying to deploy a serverless endpoint with a handler that has two individual functions instead of one.
With a traditional Flask API setup, this would look like two app.route functions.
I understand RunPod doesn't work with Flask.
But if I wanted to structure a handler.py with multiple endpoints (multiple different def handler(job) functions), how would I do that?
And then how would I call those endpoints?
I would assume it's something like this:
https://api.runpod.ai/v2/.../runsync/<endpoint1>
https://api.runpod.ai/v2/.../runsync/<endpoint2>
Can anyone assist me in solving this problem?
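To make the question concrete, this is the kind of routing I could imagine inside a single handler, dispatching on a field in the input instead of on the URL path (the route names and functions below are just placeholders); I'm not sure whether this is the intended pattern or whether there's a way to get real per-path endpoints:

import runpod

def generate(job_input):
    # placeholder for the first "route"
    return {"route": "generate", "echo": job_input}

def transform(job_input):
    # placeholder for the second "route"
    return {"route": "transform", "echo": job_input}

ROUTES = {"generate": generate, "transform": transform}

def handler(job):
    job_input = job["input"]
    # Dispatch on an "endpoint" field in the payload,
    # e.g. {"input": {"endpoint": "generate", ...}}
    route = ROUTES.get(job_input.get("endpoint"))
    if route is None:
        return {"error": f"unknown endpoint: {job_input.get('endpoint')}"}
    return route(job_input)

runpod.serverless.start({"handler": handler})

With this, the call would still go to the single /runsync URL, and the "endpoint" would be chosen in the JSON body rather than in the path.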
7 replies
•Created by zfmoodydub on 5/14/2024 in #⚡|serverless
confusing serverless endpoint issue
After a successful call through run or runsync, I get my handler's success JSON. After about 5 seconds, the successful response JSON turns into this:
Status Code 404
"error": "request does not exist"
But the process was successful.
Any ideas as to why?
I can provide the worker IDs.
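For context, this is roughly how I'm calling the endpoint and checking on the job afterwards. My working assumption is that the 404 appears when the completed result is looked up again later via /status, so I've started keeping the body of the original /runsync response instead of re-querying. The endpoint ID, key, and payload below are placeholders.

import os
import requests

# Placeholder IDs and key for illustration.
ENDPOINT_ID = os.environ["RUNPOD_ENDPOINT_ID"]
API_KEY = os.environ["RUNPOD_API_KEY"]
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

def run_sync(payload: dict) -> dict:
    """Call /runsync and keep the response body immediately."""
    url = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync"
    resp = requests.post(url, json={"input": payload}, headers=HEADERS, timeout=120)
    resp.raise_for_status()
    return resp.json()  # contains "id", "status", and "output" on success

def check_status(job_id: str) -> dict:
    """The later /status lookup that starts returning the 404 above."""
    url = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/status/{job_id}"
    return requests.get(url, headers=HEADERS, timeout=30).json()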
14 replies
•Created by zfmoodydub on 3/12/2024 in #⚡|serverless
Illegal Construction
When building a mock serverless endpoint to test locally against test_input.json, I am not receiving the
--- Starting Serverless Worker | Version 1.6.2 ---
log in my container upon run.
Trying to run python handler.py manually in the exec window of my container returns an Illegal Construction message.
Am I doing something stupid that is obviously wrong, or has anyone encountered an illegal construction message in their containers when trying to build a serverless endpoint?
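For reference, here's a stripped-down version of what I'm running. My understanding is that with the runpod SDK, running python handler.py locally with a test_input.json sitting next to it should print the banner above before invoking the handler; the contents are simplified for illustration.

handler.py (simplified):

import runpod

def handler(job):
    # Echo the test input back so a successful local run is obvious in the logs.
    return {"echo": job["input"]}

runpod.serverless.start({"handler": handler})

test_input.json:

{"input": {"prompt": "hello"}}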
42 replies