RunPod, 3w ago
Ben

Serverless request returns None from python client but web status says completed successfully

Hello, I have been baffled by this issue for weeks and I'm pulling my hair out. I have a serverless endpoint that always comes back as None from the python runpod client, with no error messages in the logs or from my inference script. Yet the runpod.io metrics for my requests always show them as completed. The logs have no errors, but it's clearly not finishing, as the last thing I see in the logs is "processed frame 205/750". When I spin up a regular pod with the exact same image, then ssh into the pod and run the exact command I use in the entrypoint, it works flawlessly. I have no idea what I could be missing.
9 Replies
Ben (OP), 3w ago
More info: this is my "staging endpoint" for a project I'm working on. I have an older version of my image in a different endpoint working flawlessly, so there are no crazy CI/CD or docker image related changes that I could attribute this to.
My endpoint accepts a video, and trying with a different, shorter video, it also terminates arbitrarily in the middle of its progress.
The TL;DR of my confusion and frustration is: why isn't there an error message from either the runpod service or the runpod python client!?
wuxmes, 3w ago
set a higher execution timeout prolly, exactly 60 secs feels like that might be the reason
Ben (OP), 3w ago
My execution timeout is 600s
wuxmes, 3w ago
Are you running this in a subprocess or directly in the python handler? Hard to say why it's ending early without looking at the code.
Wait, are you running a bash script in the docker command as startup?
Ben (OP), 3w ago
OK, so my inference script (run.py) is running on my laptop and is just using the runpod-python client to make a request with a sample video, call my serverless endpoint, and show the results. When I use my "release" endpoint, it works great. When I change the runpod endpoint ID to my "staging" endpoint, I'm getting this issue.
My entrypoint on my docker image is CMD [ "python3", "-u", "/handler.py" ]
I'm also seeing these in my logs:
"message":"Failed to return job results. | 400, message='Bad Request', url='https://api.runpod.ai/v2/<ENDPOINT_ID>/job-done/4v7s7hbe3u4a52/a6a0096f-c893-45d7-bb2a-a78a75867edf-u1?gpu=NVIDIA+RTX+4000+Ada+Generation&isStream=false'"
But I see them in my working endpoint as well, so I didn't really worry. I found 1 thread online when searching for that error, and the GitHub issue is still open.
I just pushed a new docker image version with more logging in my handler.py, including a print statement at the very end of my handler. I'm getting these logs (the first line here is my own code's success message, which gets printed at the very end of the handler):
2024-12-07T21:11:02.666019804Z Finished processing video
2024-12-07T21:11:02.666028688Z {"requestId": "sync-404d83ab-3710-49c1-9fda-b0900d70814a-u1", "message": "Failed to return job results. | 400, message='Bad Request', url='https://api.runpod.ai/v2/182hdt7pnz6xwx/job-done/l39e5w9ar6se5j/sync-404d83ab-3710-49c1-9fda-b0900d70814a-u1?gpu=NVIDIA+RTX+A4000&isStream=false'", "level": "ERROR"}
2024-12-07T21:11:02.666040220Z {"requestId": "sync-404d83ab-3710-49c1-9fda-b0900d70814a-u1", "message": "Finished.", "level": "INFO"}
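The ERROR entry above is easy to miss next to plain print output. Here is a minimal sketch of how worker log lines in this shape could be split into a timestamp and a structured payload so that ERROR-level entries stand out. The helper name `parse_worker_log_line` and the "PRINT" label for non-JSON lines are made up for illustration. The sample lines are the ones pasted above.

```python
import json

def parse_worker_log_line(line):
    """Split '<timestamp> <rest>' and decode <rest> as JSON when possible."""
    timestamp, _, rest = line.partition(" ")
    try:
        payload = json.loads(rest)
    except json.JSONDecodeError:
        payload = None
    if not isinstance(payload, dict):
        # Plain print() output from the handler; "PRINT" is an illustrative label.
        payload = {"level": "PRINT", "message": rest}
    return timestamp, payload

# The three log lines quoted in this thread.
sample = """2024-12-07T21:11:02.666019804Z Finished processing video
2024-12-07T21:11:02.666028688Z {"requestId": "sync-404d83ab-3710-49c1-9fda-b0900d70814a-u1", "message": "Failed to return job results. | 400, message='Bad Request', url='https://api.runpod.ai/v2/182hdt7pnz6xwx/job-done/l39e5w9ar6se5j/sync-404d83ab-3710-49c1-9fda-b0900d70814a-u1?gpu=NVIDIA+RTX+A4000&isStream=false'", "level": "ERROR"}
2024-12-07T21:11:02.666040220Z {"requestId": "sync-404d83ab-3710-49c1-9fda-b0900d70814a-u1", "message": "Finished.", "level": "INFO"}"""

parsed = [parse_worker_log_line(line) for line in sample.splitlines()]
errors = [p["message"] for _, p in parsed if p.get("level") == "ERROR"]
for message in errors:
    print(message)
```

Filtering this way makes it clear that the handler itself finished and that the failure happened when the worker tried to post the result back.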
yasyf, 2w ago
I'm seeing the same thing: the 400 saying it failed to return job results, and then the SDK gets back None in the output. Did you ever solve it?
nerdylive, 2w ago
Open a ticket if you get something like this frequently.
Ben (OP), 2w ago
I found the solution. Basically, my output could be written to JSON in python, but whatever service in runpod calls the serverless endpoint and forwards the results to the users (idk the names for all the services in the internal runpod arch) could NOT read it as valid JSON.
I fixed it, but as a developer I'd like to see 1) an error message and 2) a 500 error, not a 400 error.
My specific error had to do with how I handled null vals.
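One way this mismatch can arise is sketched below. This is an assumption, since the thread only says the problem involved null values: Python's `json.dumps` happily emits `NaN` and `Infinity`, which strict JSON parsers reject. The helper `to_strict_json` and the sample output dict are hypothetical, not Ben's actual code.

```python
import json
import math

def to_strict_json(value):
    """Recursively replace NaN/Infinity floats with None (JSON null)."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {str(k): to_strict_json(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_strict_json(v) for v in value]
    return value

# Hypothetical handler output containing non-finite floats.
raw = {"frames": 750, "score": float("nan"), "boxes": [1.0, float("inf")]}

# By default Python serializes this without complaint, but the text
# contains the bare tokens NaN and Infinity, which are not valid JSON.
print(json.dumps(raw))

clean = to_strict_json(raw)
# allow_nan=False makes json.dumps raise instead of emitting NaN/Infinity,
# so this call doubles as a strictness check.
print(json.dumps(clean, allow_nan=False))
```

Running a cleanup like this on the handler's return value before handing it back would keep the payload parseable by stricter JSON readers, if non-finite floats are indeed the cause.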
nerdylive, 2w ago
Wow, makes sense. Yeah, the worker posts the result back to the system to return it to the user, and if the system can't process it as JSON, it'll give a 400 Bad Request.
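Since the platform reportedly gives no clear error in this case, a handler can check its own output before returning. Below is a minimal sketch, assuming the handler is a plain function that takes a job dict and returns a JSON-like value. The decorator name `strict_json_output` and the sample handler are hypothetical. Returning a dict with an `"error"` key is, as far as I know, how a runpod handler reports a failed job, but treat that as an assumption and check the SDK docs.

```python
import functools
import json

def strict_json_output(handler):
    """Wrap a handler so output that is not strict JSON becomes an explicit error."""
    @functools.wraps(handler)
    def wrapper(job):
        result = handler(job)
        try:
            # allow_nan=False rejects NaN/Infinity; TypeError covers
            # objects the json module cannot serialize at all.
            json.dumps(result, allow_nan=False)
        except (TypeError, ValueError) as exc:
            return {"error": f"handler output is not strict JSON: {exc}"}
        return result
    return wrapper

@strict_json_output
def handler(job):
    # Hypothetical handler that accidentally returns a NaN.
    return {"mean_score": float("nan")}

print(handler({"input": {}}))

# In a real worker the wrapped function would be registered with the SDK,
# e.g. runpod.serverless.start({"handler": handler}); left as a comment so
# this sketch runs without the runpod package installed.
```

With a guard like this, the client would get an explicit error message instead of a silent None.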