Exposing HTTP ports on serverless
There's no way to expose HTTP ports on serverless, is there? When I create a new template and flip the template type from Pod to Serverless, that option goes away.
I see stuff in the codebase about
RUNPOD_REALTIME_PORT
but I'm not sure what the use case is for that, and I haven't found documentation for it. Also, I'd like to expose two ports to my serverless instances.
Why would you want to expose HTTP ports on serverless? It's not designed for that. Use pods instead.
I'm creating a version of this https://www.instagram.com/p/C8CzGuOubKp/ that anybody can use without setup. Sending frames over a websocket connection at 24fps.
I have it working with pods, but I'd rather not manage standing up and tearing down the instances for each user
wow that seems cool
That's what I think! But I can't find anyone else interested in real-time AI or AI vtubing
Yeah use pods for now...
24fps, isn't that really high? you're gonna use a bunch of high-end GPUs hahah
it all runs on one 3090
Wow, 24 images/s?
Yup. sdxl turbo https://github.com/GenDJ/GenDJ/blob/main/diffusion_processor.py
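For context, a 1-2 step sdxl-turbo img2img call in diffusers is only a few lines. This is a minimal sketch of that kind of loop, not the actual GenDJ code; it assumes diffusers is installed and a CUDA GPU is available:
```python
import torch
from diffusers import AutoPipelineForImage2Image
from PIL import Image

# Minimal sdxl-turbo img2img sketch, not the actual GenDJ code.
pipe = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

def process_frame(frame: Image.Image, prompt: str) -> Image.Image:
    # Turbo models run with guidance disabled and very few steps;
    # num_inference_steps * strength must be >= 1 for img2img.
    return pipe(
        prompt,
        image=frame,
        num_inference_steps=2,
        strength=0.5,
        guidance_scale=0.0,
    ).images[0]
```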
Oooo
But you can initiate websockets to external servers from serverless... if I'm not wrong
is there any off-the-shelf or open source solution for managing a bunch of pods? Ideally I'd like to have some sitting idle so users can start instantly, without waiting a few minutes for the server to stand up and the start script to run, set up the project, etc.
well there are some scripts that can help you get started; other options are things like Pulumi or SkyPilot
how? That would be amazing. I'm also serving a web server hosting the frontend from the pod but I can probably figure out how to decouple that from the websocket and processing
kind of infrastructure as code
ah so it's in that land
use the graphql api if you want to
for the start script I think you can make custom templates
the pod version of this is already a custom template. Works amazingly well
Woop, no direct connections to the frontend imo, make a backend for this
well it's handy for the pod version
I think the non-pod version will have to be a pretty different architecture with a whole webapp
i think for "instant use or faster loading" use some active workers serverless or pods (longer loading if you wanna match the serverless thing )
yes unless you got the same architecture in pods and serverless lol
yeah that's why I was hoping there'd be some solution out there for managing the fleet of pods. Obviously for cost reasons I wanna have as few active ones waiting for people as possible, so it'd have to be dynamic, with pretty complex logic for when to stand new ones up and tear old ones down
make your own hahah, use the graphql api
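If you go the GraphQL route, managing pods is just HTTP POSTs against the API. A rough sketch with requests; the myself/pods fields here are assumptions from memory, so check RunPod's GraphQL docs:
```python
import os
import requests

# Sketch of hitting RunPod's GraphQL API directly; the myself/pods fields
# are assumptions; verify them against the API docs.
RUNPOD_API_KEY = os.environ["RUNPOD_API_KEY"]

query = """
query Pods {
  myself {
    pods { id name desiredStatus }
  }
}
"""

resp = requests.post(
    f"https://api.runpod.io/graphql?api_key={RUNPOD_API_KEY}",
    json={"query": query},
    timeout=30,
)
print(resp.json())
```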
side note: currently I'm stuffing the actual models inside the container, which makes my Docker images like 20 gigs. Takes forever to upload and is a terrible developer experience. How are people doing it?
why graphql api instead of python sdk?
I was gonna use this https://github.com/runpod/runpod-python
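The SDK wraps the same GraphQL API, so a warm-pool manager could look something like this. A sketch assuming runpod-python's get_pods / create_pod / terminate_pod helpers; the image and GPU type are placeholders, and the exact create_pod signature should be checked against the SDK:
```python
import os
import runpod

runpod.api_key = os.environ["RUNPOD_API_KEY"]

def ensure_warm_pool(target: int) -> None:
    """Sketch of a warm-pool manager; image and GPU type are placeholders,
    and the exact create_pod signature should be checked against the SDK."""
    warm = [p for p in runpod.get_pods() if p["name"].startswith("gendj-warm")]
    for i in range(len(warm), target):  # scale up to the target size
        runpod.create_pod(
            name=f"gendj-warm-{i}",
            image_name="myuser/gendj:latest",       # placeholder image
            gpu_type_id="NVIDIA GeForce RTX 3090",  # placeholder GPU type
        )
    for pod in warm[target:]:  # scale down any extras
        runpod.terminate_pod(pod["id"])
```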
well there is an alternative, you can offload the model into network storage
can I auto mount a network drive on all new pods I stand up?
yes, it's automatically mounted at /workspace
if you choose it in ui or start from graphql, both can be done
but you're limited to a region that has the network volume
ahh dealbreaker
dw tho, it's still a huge pool of GPUs, you just need to pick the right region that has the GPU model you wanna use
but it's also slower than container disk
well I've noticed some regions aren't good at maintaining the websocket connection, it keeps dropping. And regions run out of GPUs quite frequently in my experience. I can't couple myself to one region, especially not when my current images are working fine, they're just huge and annoying
tradeoff not worth it
wish there were more gpus in usa regions
which regions?
most lol
RO is working fine
ooof
and us?
never had the gpus I want available in US
ever lol
yeah well, GPUs are limited by the stock in the DCs
DCs?
datacenters
do you work at runpod? you're helpful btw, thanks for answering my random questions
I've felt like I'm wandering through a dark forest lol
hahaha well not officially working
feel free to explore the docs, or find some code on Google or GitHub
so I guess my next step is to build the whole dang pod management thing
also if you're building for serverless you can test out on pods too
sure goodluck with that!
but geez, the serverless stuff all works great, I just would love to use it with the ports exposed
spin up a job when a user wants to do the live thing, spin it down when they're done
why not initiate it from internal to external
that seems more fit to pods workload
not sure what you mean
like I said before, initiate the WS from the serverless worker instead of exposing a port and connecting from outside
I think you need the port exposed to make a websocket connection, no?
hold on spinning up a pod without the ws port open to confirm
Yes, but instead expose the port on your server and connect to it from RunPod
RunPod's outbound firewall is all open, if I'm not wrong
that's not possible if you're connecting from your server to RunPod, but it works the other way around if you've got open ports on your own server
oh snap
well nah it needs to go both ways
well wait no that might work
I'm gonna quickly spin something up locally, ngrok into it, and see if that works
Yeah that might work, try it out on pods if you want
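For reference, the outbound idea on a serverless worker might look roughly like this. A sketch using the websockets package; the handler shape follows runpod-python's serverless pattern, while the relay_url input field and the frame protocol are made up for illustration:
```python
import runpod
import websockets

# Sketch: the worker dials OUT to your relay server instead of exposing a port.
# The relay_url input field and the echo-style frame loop are hypothetical.
async def handler(job):
    relay_url = job["input"]["relay_url"]  # e.g. wss://my-server.example.com/worker
    async with websockets.connect(relay_url) as ws:
        async for frame_jpg in ws:      # JPEG frame pushed from your server
            processed = frame_jpg       # placeholder: run the diffusion step here
            await ws.send(processed)    # warped frame back out
    return {"status": "session closed"}

runpod.serverless.start({"handler": handler})
```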
ur a genius
the only annoying thing is now I need to maintain double the websocket connections
double?
browser -><- my server -><- runpod
instead of just sending the frames directly between the runpod instance and the browser
in fact, for the server -><- runpod connection, maybe I want to find a better way to do it than websockets. Some messaging queue or something
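The relay in the middle doesn't have to be much. A minimal asyncio sketch, assuming one browser and one worker per session, with no auth or error handling; the role:session handshake is hypothetical:
```python
import asyncio
import websockets

# Minimal relay: pairs one browser and one worker per session and forwards
# frames both ways. The role:session handshake is hypothetical.
sessions: dict[str, dict] = {}  # session_id -> {"browser": ws, "worker": ws}

async def pump(src, dst):
    async for msg in src:
        await dst.send(msg)

async def handle(ws):
    role, session_id = (await ws.recv()).split(":", 1)  # e.g. "browser:abc123"
    peers = sessions.setdefault(session_id, {})
    peers[role] = ws
    if len(peers) == 2:  # both sides connected, start forwarding
        await asyncio.gather(
            pump(peers["browser"], peers["worker"]),
            pump(peers["worker"], peers["browser"]),
        )
    else:
        await ws.wait_closed()  # first arrival just waits for its peer

async def main():
    async with websockets.serve(handle, "0.0.0.0", 8765):
        await asyncio.Future()  # run forever

asyncio.run(main())
```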
sure
but that would add some more delay ig
if not sockets
the second one, hmm, that would work on pods yeah, but it's not really secure if you have more ports exposed or security flaws
well that's how it works currently
wanna try it out lol
hmm sure
dming the link 1 sec
With GenDJ would it be possible to have it process video from a file rather than live? Take in mp4 return mp4?
it would totally be possible, just haven't built it yet
currently it just turns the frames of your webcam into JPEGs and sends them to the server, you'd do the same thing with the video as it plays
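For the mp4 case, the client loop would be almost identical to the webcam one. A sketch with OpenCV and websockets; the server URL and wire format are assumptions:
```python
import asyncio
import cv2
import websockets

# Sketch: stream an mp4's frames as JPEGs over a websocket, mirroring the
# webcam client. The URL and wire format are hypothetical.
async def stream_video(path: str, url: str) -> None:
    cap = cv2.VideoCapture(path)
    async with websockets.connect(url) as ws:
        while True:
            ok, frame = cap.read()
            if not ok:  # end of file
                break
            _, jpg = cv2.imencode(".jpg", frame)
            await ws.send(jpg.tobytes())
            processed = await ws.recv()  # warped frame back from the worker
            # collect `processed` frames into an output mp4, e.g. cv2.VideoWriter
    cap.release()

asyncio.run(stream_video("input.mp4", "wss://my-worker.example.com/ws"))
```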
I'll take a look at the source, might be able to adapt.
PRs welcome lol
Also, for my server -><- runpod you are not limited to websockets. If it were me I would have serverless connect to my_server via OpenVPN. Really easy to set up and automate.
interesting. I tried Cloudflare Tunnel but couldn't figure it out. hadn't thought of a VPN
would that scale? Like, could many people use it at the same time?
and how would I send the frames?
Assuming you don't care about hyper-focusing on encryption, there is very little overhead. You could send frames via SCP, or you could mount a disk from the server... If you want to keep it off disk then you could receive it via a webhook... really any way you can over the network.
With that said, I am in the process of building out a frontend that acts as websocket server and client... sometimes it is the best way to go.
tbh might want to flip back to the original i2i-realtime and use zmq https://github.com/kylemcdonald/i2i-realtime/blob/main/worker_app.py
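On the worker side, zmq would roughly look like this. A generic PUSH/PULL sketch with pyzmq; the endpoints are placeholders, and this is not necessarily the topology worker_app.py actually uses:
```python
import zmq

# Generic PUSH/PULL frame pipeline; endpoints are placeholders, and this is
# not necessarily the topology worker_app.py actually uses.
ctx = zmq.Context()
pull = ctx.socket(zmq.PULL)   # frames in from the server
pull.connect("tcp://my-server.example.com:5555")
push = ctx.socket(zmq.PUSH)   # processed frames back out
push.connect("tcp://my-server.example.com:5556")

while True:
    frame_jpg = pull.recv()
    processed = frame_jpg  # placeholder: run the diffusion step here
    push.send(processed)
```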
Yes, combine that with an OpenVPN connection and you're in business.
RunPod has teased adding backend networks that serverless can connect to, along with your external servers. Until that happens, OpenVPN is pretty close to that.
quick question: do we pay for pod startup time?
pods, I guess yes, you pay for them