Is /stream a POST endpoint or a GET endpoint (locally)?
Is /stream a POST endpoint or a GET endpoint? I'm trying to run the handler locally with streaming before hosting it on RunPod, but it's not working.
Noticed the example here: https://doc.runpod.io/reference/llama2-13b-chat#streaming-token-outputs
where /stream is a GET endpoint
But when I check the FastAPI Swagger UI, it shows /stream as a POST endpoint, which of course doesn't stream but rather returns all the results together.
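I'm serving the handler locally through the SDK's FastAPI test server (assuming the standard flag; that's where the Swagger UI comes from):

```
python handler.py --rp_serve_api
```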
Here is my simplified handler:
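Something like this minimal sketch (assuming the runpod SDK's generator-handler streaming pattern; the token list is just a stand-in for real model output):

```python
import runpod

def handler(job):
    prompt = job["input"].get("prompt", "")
    # Yielding from the handler makes it a generator, which is what
    # enables token-by-token streaming on the hosted /stream endpoint.
    for token in ["echo:", " ", prompt]:
        yield token

runpod.serverless.start({
    "handler": handler,
    # Aggregate the streamed chunks so /run and /runsync still
    # return the full output once the job finishes.
    "return_aggregate_stream": True,
})
```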
and my client, which fails locally with a 405 Method Not Allowed error when calling /stream via GET.
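For comparison, the hosted flow from that docs page looks roughly like this (a sketch; the endpoint ID and API key are placeholders):

```python
import requests

ENDPOINT_ID = "your-endpoint-id"  # placeholder
API_KEY = "your-api-key"          # placeholder
BASE = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Submit the job asynchronously via POST /run...
job = requests.post(f"{BASE}/run", headers=HEADERS,
                    json={"input": {"prompt": "Hello"}}).json()

# ...then fetch partial output from GET /stream/{job_id}.
chunk = requests.get(f"{BASE}/stream/{job['id']}", headers=HEADERS).json()
print(chunk)
```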
12 Replies
@ashleyk can you help?
Nope
do you know anyone who can? RunPod team?
@flash-singh ?
The RunPod devs are in the US so they will probably only be online in a few hours.
@Merrell will probably be able to advise.
both GET and POST work
neither of them works for me
Don't think it works locally.
Then I'll try to deploy it. But if so, I guess that's not ideal behavior, since testing on-prem is not very convenient.
https://blog.runpod.io/runpod-dockerless-cli-innovation/
@goku Can try this. It uses GPU pods to live-reload against a developer env and puts your endpoints there for testing. Haven't tried it for /stream, but it might work for your situation.
what does dockerless mean?
does it build in a dev env and then upload?
It's bad naming; in the docs it's called "Projects" under runpodctl.
Docs still need to be drastically improved imo.
But all it means is they want you to be able to develop a handler.py without worrying too much about setting up your own custom Docker image.
Though arguably I think it's still not the best workflow: if you need dependencies outside of Python, they don't handle it the best right now, either requiring you to apt-get install them through another terminal, or to use "your own custom Docker image" as a base with the other special dependencies installed.
But overall I think it's a nice development flow, since the endpoints were automatically there to test when I played with it before; just personally, my workflows are simple, so I test against GPU Cloud manually.
The idea is you write a handler.py and it live-refreshes against a pod in the background for you to test against,
and the pod also auto-stops / shuts down if you disconnect for a certain amount of time.
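From memory the Projects flow is roughly this (exact command names may differ by runpodctl version):

```
runpodctl project create    # scaffold a project with a handler.py
runpodctl project dev       # live-reload the handler against a dev pod
runpodctl project deploy    # push it to a serverless endpoint
```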
Ah ye