How do I handle both streaming and non-streaming requests in a serverless pod?
How can I handle both effectively? Is it okay to have a handler with both yield and return? i.e.
Will this work?
39 Replies
Seems it should work. Have you tried testing it locally? Runpod works really well locally, so it's easy to test
Generator Handler | RunPod Documentation
A handler that can stream fractional results.
Eh. runpod local stuff is pretty limited imo. It works for basic use case validation, but i tend to just run it against a GPU Pod instead since it seems to actually work by just directly calling the function.
I just skip the runpod.serverless.start(...) call when I'm on a GPU Pod.
As for mixing yield and return, you should be able to, but I don't think it matters if you just use yield for both cases.
https://github.com/justinwlin/Runpod-OpenLLM-Pod-and-Serverless/blob/main/handler.py
Nice way to run the same handler in both pod and serverless!
Thx! Yeah, I love being able to debug on GPU Pod, makes it much easier.
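A minimal sketch of the "same handler on a GPU Pod and on serverless" idea described above. The `MODE_TO_RUN` environment variable and the echo handler are illustrative assumptions, not taken from the linked repo; the only RunPod call used is `runpod.serverless.start`, and it is imported lazily so the pod path works without the SDK installed.

```python
import os


def handler(job):
    # Placeholder logic: echo the input back.
    return {"echo": job["input"]}


def main(mode=None):
    # MODE_TO_RUN is a hypothetical switch; name it however you like.
    mode = mode or os.environ.get("MODE_TO_RUN", "pod")
    if mode == "serverless":
        # Only import and start the worker loop when running as a serverless worker.
        import runpod

        runpod.serverless.start({"handler": handler})
        return None
    # On a GPU Pod (or locally), call the handler directly with a fake job.
    return handler({"id": "local-test", "input": {"key": "value"}})


if __name__ == "__main__":
    print(main())
```

Calling the function directly on a pod sidesteps the worker loop entirely, which is what makes debugging easier there.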
no unfortunately it's not easy to test locally, but I tried deploying it anyway and turns out whenever there's yield in the handler function everything I return becomes a generator? I can't get it to work yet
yeah it should be right? if you return using yield the datatype would be stream-like so its a streamed response
if endpoint == "stream_response": # i think this isnt how you retrieve inputs?
    yield stream_response()
elif endpoint == "get_response": # i think this isnt how you retrieve inputs? and this too.
https://discord.com/channels/912829806415085598/948767517332107274/1235352455995199500
Maybe check whether you've updated your runpod version? lol just random guess
leme check docs first
but my code that i link, works with both streaming / nonstreaming
i feel the docs don't show streaming surprisingly 👁️, at least on the client side that there is an endpoint you gotta hit called /stream
{
"id": "A_RANDOM_JOB_IDENTIFIER",
"input": { "key": "value" } # this is retrieved
}
your_handler.py
thats how you retrieve inputs
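For reference, a small sketch of reading the payload shown above inside a handler. The handler receives the whole job dict, and the user-supplied values sit under `"input"`. The `"key"` field and the returned dict shape are only placeholders for illustration.

```python
def handler(job):
    # The job dict carries an "id" plus the user payload under "input".
    job_input = job["input"]
    value = job_input.get("key")  # "key" is just the field from the example payload
    return {"received": value}
```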
there is, only in the endpoint docs. i came looking for that before
oh interesting
maybe im just bad at looking
ah found it
Endpoint operations | RunPod Documentation
Comprehensive guide on interacting with models using RunPod's API Endpoints without managing the pods yourself.
yep yep
just gotta be patient haha
yeaaaaa, not the best doc xD
true
i find laravel docs way easier to understand
A bit more involved imo is the only problem with the doc:
sorry, I mean whenever a yield is present in the handler, the output becomes a generator regardless if I use yield or return. for example
when something is false it also returns generator?
which endpoint are you hitting
yeah that's not how, i just wrote the example that way for convenience
have you tried the stream endpoint or the run and runsync?
whats the output like
i hit the /run endpoint first, then retrieve the stream by hitting /stream
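A rough client-side sketch of that flow: submit with /run, then poll /stream for partial output. The base URL, the `stream` and `status` response fields, and the terminal status names reflect my reading of the RunPod endpoint docs and should be treated as assumptions; all helper names are made up for this example. The HTTP helpers are injectable so the loop can be tried without network access.

```python
import json
import time
import urllib.request

API_BASE = "https://api.runpod.ai/v2"  # assumed base URL


def run_url(endpoint_id):
    return f"{API_BASE}/{endpoint_id}/run"


def stream_url(endpoint_id, job_id):
    return f"{API_BASE}/{endpoint_id}/stream/{job_id}"


def post_json(url, api_key, payload):
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def get_json(url, api_key):
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def run_and_stream(endpoint_id, api_key, job_input, post=post_json, get=get_json, poll_interval=1.0):
    # Submit the job, then keep polling /stream until a terminal status shows up.
    job_id = post(run_url(endpoint_id), api_key, {"input": job_input})["id"]
    while True:
        page = get(stream_url(endpoint_id, job_id), api_key)
        for item in page.get("stream", []):  # assumed response shape
            yield item.get("output")
        if page.get("status") in ("COMPLETED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_interval)
```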
sorry i can't produce screenshot right now, but in the local testing library, the output is somewhat like this:
assume that this is my code
it is expected if something is True, but somehow when something is False, it returns a generator as well
use run instead hm
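This is plain Python behavior rather than anything RunPod-specific: a function whose body contains `yield` anywhere is a generator function, so calling it always produces a generator object, even when the branch taken only hits `return`. The toy handler below (the `stream` flag is a made-up input) shows it, including why a consumer that just iterates sees nothing.

```python
import inspect


def handler(job):
    if job["input"].get("stream"):
        yield "chunk-1"
        yield "chunk-2"
    else:
        # In a generator function this does not hand the value to the caller;
        # it ends iteration and stores the value on StopIteration.
        return "full response"


result = handler({"input": {"stream": False}})
print(inspect.isgenerator(result))  # True
print(list(result))                 # []

gen = handler({"input": {"stream": False}})
try:
    next(gen)
except StopIteration as stop:
    print(stop.value)               # full response
```

Anything that consumes the handler by iterating over it will therefore see an empty stream on the `return` branch.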
where is that from: "output: <generator object ...>"
haha sorry was dizzy, i meant /run
that's from the runpod local dev server log, the one that shows the output of the handler function
whats your code
just send it here i wanna see how do you print it
output
the endpoint's value is not stream_answer, yet it always returns a generator
What is invoke(endpoint, payload)?
what is endpoint too?
maybe some of that returns a generator already
this is how the invoke method is, i omitted irrelevant codes but that's generally how the function is.
I got endpoint from this line in the handler function, which will pass job into a validator function that simply check if my payload schema is ok. basically the inputted payload should look like this:
basically that function validates and extract the endpoint variable from my input payload
i am very sure that this is not the case, tried and checked every other function. this stream function is a new addition to the code, and before that (all the functions above stream_func are the original code) no function was ever coded to return a generator
why does it matter if its a generator or not?
If there is nothing else being sent, then the response is just done anyways
Whether you use yield or return
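One way to act on that, sketched under assumptions: make every branch yield, so the non-streaming case becomes a stream with exactly one item. `stream_response` and `get_response` are stand-ins with made-up bodies, and the `endpoint` field mirrors the snippet earlier in the thread. Note the `yield from`: a bare `yield stream_response()` would emit the generator object itself as a single item rather than its chunks.

```python
def stream_response(payload):
    # Stand-in for a real token stream.
    for token in ["Hello", " ", "world"]:
        yield token


def get_response(payload):
    # Stand-in for a real one-shot response.
    return "Hello world"


def handler(job):
    job_input = job["input"]
    endpoint = job_input.get("endpoint")
    if endpoint == "stream_response":
        # Delegate to the inner generator so each chunk is yielded individually.
        yield from stream_response(job_input)
    else:
        # Non-streaming case: a single yielded item.
        yield get_response(job_input)
```

The trade-off is that callers of the non-streaming path now get their result wrapped in stream form, which may not suit existing clients.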
yeah true
you should debug, where the response gets to type "generator"
like logging from the returns, the variables
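A small debugging helper along those lines, assuming nothing beyond the standard library: it reports whether a function is a generator function and what type a call actually produces. The two sample functions are invented for the demonstration.

```python
import inspect


def describe(func, *args):
    # Call the function once and report what kind of object comes back.
    result = func(*args)
    return {
        "is_generator_function": inspect.isgeneratorfunction(func),
        "result_type": type(result).__name__,
    }


def plain(job):
    return job["input"]


def mixed(job):
    if job["input"].get("stream"):
        yield "chunk"
    return "done"


print(describe(plain, {"input": {}}))
print(describe(mixed, {"input": {}}))
```

Running this against each function in the call chain narrows down where the generator type first appears.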
because a change in the response type would disrupt other running services, i aim to avoid that
this code also returns a generator
I think it's just the way it is
thank you all
alright
i think this a problem with local testing
it shouldnt return a generator
should just be 1
it still returned a generator when the same code ran on serverless. there's no log to prove it's a generator, but the behavior is the same as before, when all my outputs were cast to a generator: it returns [] and an empty stream when streamed.
Does normal streaming work?
You might need to contact runpod from web or email for this problem
the normal streaming works well, now we have 2 separate endpoints one to handle all non-streaming and one to handle all the streaming hahah
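A sketch of what that split can look like: one generator handler for the streaming endpoint and one plain handler for the non-streaming endpoint, sharing the same underlying logic. `generate_tokens` and the `prompt` field are hypothetical; in a real deployment each handler would presumably be passed to `runpod.serverless.start({"handler": ...})` in its own worker.

```python
def generate_tokens(prompt):
    # Hypothetical stand-in for the model: emit the prompt word by word.
    for word in prompt.split():
        yield word


def stream_handler(job):
    # Streaming endpoint: contains yield, so it is always a generator.
    yield from generate_tokens(job["input"]["prompt"])


def sync_handler(job):
    # Non-streaming endpoint: no yield anywhere, so return behaves normally.
    return " ".join(generate_tokens(job["input"]["prompt"]))
```

Keeping `yield` out of the non-streaming handler entirely is what preserves its original return type for existing clients.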
will try to do so
yeah nice fix for this