Serverless request returns None from python client but web status says completed successfully
Hello, I have been baffled by this issue for weeks and I'm pulling my hair out.
I have a serverless endpoint that always comes back as None from the python runpod client with no error messages in the logs or from my inference script.
Yet the runpod.io metrics for my requests always show them as completed.
The logs have no errors, but it's clearly not finishing, as the last thing I see in the logs is "processed frame 205/750"
When I spin up a regular pod with the exact same image, then ssh into the pod and run the exact command I use in the entrypoint it works flawlessly.
I have no idea what I could be missing.
9 Replies
more info: This is my "staging endpoint" for a project I'm working on. I have an older version of my image in a different endpoint working flawlessly, so there are no crazy CI/CD or docker image related changes that I could attribute this to
My endpoint accepts a video, and when I try a different, shorter video it also terminates arbitrarily in the middle of its progress
the TLDR of my confusion and frustration is: why isn't there an error message from either the runpod service or the runpod python client!?
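Until the service reports this case itself, a client-side guard can at least turn the silent None into a visible error. This is a minimal sketch, not the runpod-python API: it assumes a job handle exposing `status()` and `output()` methods and a "COMPLETED" status string, and `require_output` and `FakeJob` are hypothetical names used only so the example runs without network access.

```python
def require_output(job):
    """Return the job's output, raising instead of silently passing along None."""
    # Assumption: output() waits for the job to finish, so status() is read afterwards.
    output = job.output()
    status = job.status()
    if status != "COMPLETED":
        raise RuntimeError(f"job ended with status {status!r}")
    if output is None:
        raise RuntimeError(
            "job reported COMPLETED but returned no output; "
            "check the worker logs for a failed result upload"
        )
    return output


class FakeJob:
    """Hypothetical stand-in for a real job handle (no network calls)."""

    def __init__(self, status, output):
        self._status = status
        self._output = output

    def status(self):
        return self._status

    def output(self):
        return self._output


# A job that claims success but carries no output now fails loudly.
try:
    require_output(FakeJob("COMPLETED", None))
except RuntimeError as err:
    print(f"caught: {err}")
```

If a handler can legitimately return nothing, the None check would need to be relaxed.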
set a higher execution timeout prolly, exactly 60 secs feels like that might be the reason
My execution timeout is 600s
are you running this in a subprocess or directly in the python handler? hard to say why it's ending early without looking at the code
wait are you running a bash script in docker command as startup?
ok so my inference script (run.py) is running on my laptop and is just using the runpod-python client to make a request with a sample video, call my serverless endpoint, and show the results.
When I use my "release" endpoint, it works great. When I change the runpod endpoint ID to my "staging" endpoint, I'm getting this issue
my entrypoint on my docker image is
CMD [ "python3", "-u", "/handler.py" ]
I'm also seeing these in my logs:
but I see them in my working endpoint as well so I didn't really worry
I found 1 thread online when searching for that error and the GitHub issue is still open
I just pushed a new docker image version with more logging in my handler.py, including a print statement at the very end of my handler. I'm getting these logs:
The first line here is my own code's success message that gets printed at the very end of the handler
I'm seeing the same thing: the 400 saying it failed to return job results, and then the SDK gets back None in the output. Did you ever solve it?
open a ticket if you get something like this frequently
I found the solution. Basically, Python could serialize my output to JSON, but whatever RunPod service calls the serverless endpoint and forwards the result to the user (I don't know the names of the services in RunPod's internal architecture) could NOT read it as valid JSON
I fixed it, but as a developer I'd like to see
1) an error message
2) a 500 error not a 400 error
My specific error had to do with how I handled null values
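For anyone hitting the same thing, here is a minimal sketch of one way output can serialize fine in Python yet fail a strict JSON parser. It assumes the missing values were stored as NaN, which is a guess and not something confirmed above. Python's `json.dumps` emits a bare `NaN` token by default, which is not valid JSON. The helper names `sanitize` and `checked_output` are hypothetical; the idea is to run the handler's return value through something like this before returning it.

```python
import json
import math


def sanitize(value):
    """Recursively replace NaN/Infinity floats with None so the result is strict JSON."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {k: sanitize(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(v) for v in value]
    return value


def checked_output(output):
    """Return a sanitized copy of output, raising if it is still not strict JSON."""
    clean = sanitize(output)
    # allow_nan=False makes json.dumps raise ValueError on NaN/Infinity instead of
    # emitting the non-standard NaN / Infinity tokens that strict parsers reject.
    json.dumps(clean, allow_nan=False)
    return clean


# Hypothetical handler output where a missing score ended up as NaN.
raw = {"frames": 750, "scores": [0.9, float("nan")], "label": None}
print(json.dumps(raw))                  # default settings emit a bare NaN token
print(json.dumps(checked_output(raw)))  # NaN replaced with null
```

Calling `json.dumps(..., allow_nan=False)` inside the handler also means a bad payload fails in your own logs with a traceback, instead of surfacing later as a 400 from the platform.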
Wow, makes sense. Yeah, the worker posts the result back to the system for return to the user, and if the system can't parse it as JSON, it gives a 400 (bad request)