Exposing http ports on serverless

There's no way to expose HTTP ports on serverless, is there? When I create a new template and flip the template type from Pod to Serverless, that option goes away.
66 Replies
MrAssisted
MrAssisted5d ago
I see stuff in the codebase about RUNPOD_REALTIME_PORT but I'm not sure what the use case is for that, and I haven't found documentation for it. Also, I'd like to expose two ports to my serverless instances.
digigoblin
digigoblin5d ago
Why would you want to expose HTTP ports on serverless? It's not designed for that. Use pods instead.
MrAssisted
MrAssisted5d ago
I'm creating a version of this https://www.instagram.com/p/C8CzGuOubKp/ that anybody can use without setup, sending frames over a websocket connection at 24fps. I have it working with pods, but I'd rather not manage standing up and tearing down the instances for each user.
nerdylive
nerdylive5d ago
wow that seems cool
MrAssisted
MrAssisted5d ago
That's what I think! But I can't find anyone else interested in real time ai or ai vtubing
nerdylive
nerdylive5d ago
Yeah, use pods for now... 24fps, isn't that really high? You're gonna use a bunch of high-end GPUs hahah
MrAssisted
MrAssisted5d ago
it all runs on one 3090
nerdylive
nerdylive5d ago
Wow, 24 images/s?
MrAssisted
MrAssisted5d ago
GitHub: GenDJ/diffusion_processor.py at main · GenDJ/GenDJ
nerdylive
nerdylive5d ago
Oooh, but you can initiate websockets to external servers from serverless... if I'm not wrong
MrAssisted
MrAssisted5d ago
Is there any off-the-shelf or open source solution for managing a bunch of pods? Ideally I'd like to have some sitting idle so users can instantly start using it, without waiting a few minutes for the server to stand up and for the start script to run, set up the project, etc.
nerdylive
nerdylive5d ago
Well, there are some scripts that can help you get started; other options are maybe Pulumi or SkyPilot.
MrAssisted
MrAssisted5d ago
How? That would be amazing. I'm also serving a web server hosting the frontend from the pod, but I can probably figure out how to decouple that from the websocket and processing.
nerdylive
nerdylive5d ago
Kind of infrastructure as code.
MrAssisted
MrAssisted5d ago
ah so it's in that land
nerdylive
nerdylive5d ago
Use the GraphQL API. If you want a start script, I think you can make custom templates.
MrAssisted
MrAssisted5d ago
the pod version of this is already a custom template. Works amazingly well
nerdylive
nerdylive5d ago
Woop, no direct connections to the frontend imo, make a backend for this
MrAssisted
MrAssisted5d ago
Well, it's handy for the pod version. I think the non-pod version will have to be a pretty different architecture, with a whole webapp.
nerdylive
nerdylive5d ago
I think for "instant use or faster loading", use some active workers on serverless, or pods (longer loading if you wanna match the serverless thing). Yes, unless you got the same architecture in pods and serverless lol
MrAssisted
MrAssisted5d ago
Yeah, that's why I was hoping there'd be some solution out there for managing the fleet of pods. Obviously for cost reasons I wanna have as few active pods waiting for people as possible, so it'd have to be dynamic, with pretty complex logic for when to stand new ones up / tear old ones down.
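That stand-up/tear-down decision can at least be isolated into a pure sizing function. A minimal sketch, where the function names, headroom ratio, and limits are all hypothetical choices rather than anything RunPod provides:

```python
def warm_pool_target(active_users: int, min_idle: int = 1,
                     max_pods: int = 10, headroom: float = 0.25) -> int:
    """Decide how many pods to keep running: one per active user,
    plus idle headroom so new users get an instant start."""
    idle = max(min_idle, round(active_users * headroom))
    return min(active_users + idle, max_pods)


def scaling_action(running: int, target: int) -> int:
    """Positive -> pods to stand up, negative -> pods to tear down."""
    return target - running
```

A control loop would call `warm_pool_target` on a timer and feed `scaling_action` into whatever pod create/terminate API you end up using.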
nerdylive
nerdylive5d ago
Make your own hahah, use the GraphQL API.
MrAssisted
MrAssisted5d ago
Side note: currently I'm stuffing the actual models inside the container, which makes my Docker images like 20 gigs; they take forever to upload and it's a terrible developer experience. How are people doing it? Why the GraphQL API instead of the Python SDK? I was gonna use this https://github.com/runpod/runpod-python
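For the SDK route, a minimal sketch of standing up a pod with runpod-python; the keyword names follow the SDK's create_pod() but should be verified against your installed version, and the image name and websocket port are hypothetical:

```python
import os

def gendj_pod_spec(image: str, ws_port: int = 8765) -> dict:
    """Kwargs for runpod.create_pod(); parameter names are based on
    the runpod-python SDK -- verify against your SDK version."""
    return {
        "name": "gendj-worker",
        "image_name": image,                      # hypothetical image
        "gpu_type_id": "NVIDIA GeForce RTX 3090",
        "ports": f"{ws_port}/http",               # the websocket port
        "container_disk_in_gb": 20,
    }

# Only hit the API when a key is configured.
if os.environ.get("RUNPOD_API_KEY"):
    import runpod  # pip install runpod
    runpod.api_key = os.environ["RUNPOD_API_KEY"]
    pod = runpod.create_pod(**gendj_pod_spec("yourrepo/gendj:latest"))
```

Keeping the spec as a plain dict makes it easy to template per-region or per-GPU variants later.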
nerdylive
nerdylive5d ago
Well, there is an alternative: you can offload the models onto network storage.
MrAssisted
MrAssisted5d ago
can I auto mount a network drive on all new pods I stand up?
nerdylive
nerdylive5d ago
Yes, it's automatically mounted at /workspace if you choose it in the UI or start from GraphQL. Both can be done, but you're limited to a region that has the network volume.
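In the worker, that mostly means preferring the /workspace mount over a copy baked into the image. A minimal sketch, where the directory names are hypothetical:

```python
from pathlib import Path

def resolve_model_dir(name: str,
                      roots=("/workspace/models", "/models")) -> Path:
    """Prefer the network-volume mount (/workspace) when it exists,
    fall back to a copy baked into the container image."""
    for root in roots:
        candidate = Path(root) / name
        if candidate.exists():
            return candidate
    raise FileNotFoundError(f"model {name!r} not found under {roots}")
```

The same image then runs both ways: small image plus network volume where one is available, fat image with baked-in weights where it isn't.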
MrAssisted
MrAssisted5d ago
ahh dealbreaker
nerdylive
nerdylive5d ago
Dw tho, it's still a huge pool of GPUs; you just need to pick the right region that has the GPU model you wanna use. But also, it's slower than container disk.
MrAssisted
MrAssisted5d ago
Well, I've noticed some regions aren't good for maintaining the websocket connection; it keeps dropping. Also, regions run out of GPUs quite frequently in my experience. I can't couple myself to one region, especially not when the current images I have are working fine, just huge and annoying. Tradeoff not worth it. Wish there were more GPUs in US regions.
nerdylive
nerdylive5d ago
which regions?
MrAssisted
MrAssisted5d ago
Most lol. RO is working fine.
nerdylive
nerdylive5d ago
Ooof, and US?
MrAssisted
MrAssisted5d ago
never had the gpus I want available in US ever lol
nerdylive
nerdylive5d ago
Yeah well, GPU availability is limited by the stock in the DCs.
MrAssisted
MrAssisted5d ago
DCs?
nerdylive
nerdylive5d ago
datacenters
MrAssisted
MrAssisted5d ago
Do you work at RunPod? You're helpful btw, tks for answering my random questions. I've felt like I'm wandering through a dark forest lol
nerdylive
nerdylive5d ago
Hahaha, well, not officially working. Feel free to explore the docs, or find some code on Google or GitHub.
MrAssisted
MrAssisted5d ago
so I guess my next step is to build the whole dang pod management thing
nerdylive
nerdylive5d ago
Also, if you're building for serverless you can test it out on pods too. Sure, good luck with that!
MrAssisted
MrAssisted5d ago
But geez, the serverless stuff all works great. I'd just love to use that with the ports exposed: spin up a job when a user wants to do the live thing, spin it down when they're done.
nerdylive
nerdylive5d ago
Why not initiate it from internal to external? That seems more fit to a pods workload.
MrAssisted
MrAssisted5d ago
not sure what you mean
nerdylive
nerdylive5d ago
Like I said before: initiate the ws from the serverless worker, don't expose a port and then connect from external.
MrAssisted
MrAssisted5d ago
I think you need the port exposed to make a websocket connection, no? Hold on, spinning up a pod without the ws port open to confirm.
nerdylive
nerdylive5d ago
Yes, but instead expose the port on your server and connect to it from RunPod. RunPod's firewall for outbound is all open, if I'm not wrong. It's not possible if you're connecting from your server to RunPod, but it works the other way around, if you've got open ports on your own server.
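A minimal sketch of that "dial out" direction: the RunPod side opens the websocket to your server, so no inbound port is needed on RunPod. The URL and message format here are assumptions, and the connection itself uses the third-party websockets package:

```python
# The worker connects *out* to your server, so no inbound
# port needs to be exposed on the RunPod side.
import asyncio
import base64
import json

def frame_message(frame_id: int, jpg: bytes) -> str:
    """Wrap one jpg frame as a JSON text message (hypothetical format)."""
    return json.dumps({"id": frame_id,
                       "jpg": base64.b64encode(jpg).decode()})

async def relay(server_url: str, frames) -> None:
    """Push frames to your publicly reachable server."""
    import websockets  # pip install websockets
    async with websockets.connect(server_url) as ws:
        for i, jpg in enumerate(frames):
            await ws.send(frame_message(i, jpg))

# Usage sketch: asyncio.run(relay("wss://your-server.example/ingest", frames))
```

Your server then pairs each worker connection with the matching browser connection and forwards frames between them.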
MrAssisted
MrAssisted5d ago
Oh snap. Well nah, it needs to go both ways... well wait, no, that might work. I'm gonna quick spin something up locally, ngrok into it, and see if that works.
nerdylive
nerdylive5d ago
Yeah that might work, try it out on pods if you want
MrAssisted
MrAssisted5d ago
You're a genius. The only annoying thing is now I need to maintain double the websocket connections.
nerdylive
nerdylive5d ago
double?
MrAssisted
MrAssisted5d ago
browser -><- my server -><- runpod, instead of just sending the frames directly between the RunPod instance and the browser. In fact, for the server -><- runpod connection, maybe I want to find a better way to do it than websockets. Some messaging queue or something.
nerdylive
nerdylive5d ago
Sure, but that would add some more delay, I guess, if not sockets for the second hop. Hmm, that would work on pods, yeah, but it's not really secure if you have more ports exposed or security flaws.
MrAssisted
MrAssisted5d ago
Well, that's how it works currently. Wanna try it out lol
nerdylive
nerdylive5d ago
hmm sure
MrAssisted
MrAssisted5d ago
dming the link 1 sec
Encyrption
Encyrption5d ago
With GenDJ would it be possible to have it process video from a file rather than live? Take in mp4 return mp4?
MrAssisted
MrAssisted5d ago
It would totally be possible, just haven't built it yet. Currently it just turns the frames of your webcam into jpgs and sends them to the server; you'd do the same thing with the video as it plays.
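A sketch of that mp4 path, assuming OpenCV for decoding (an assumption, not what GenDJ ships): downsample the file to ~24fps and jpg-encode each kept frame, exactly like the webcam path.

```python
def keep_frame(index: int, src_fps: float, target_fps: float = 24.0) -> bool:
    """True if frame `index` should be kept to downsample to target_fps."""
    if index == 0 or src_fps <= target_fps:
        return True
    ratio = target_fps / src_fps
    # Keep the frame whenever the target-frame counter ticks over.
    return int(index * ratio) != int((index - 1) * ratio)

def mp4_to_jpgs(path: str, target_fps: float = 24.0):
    """Yield jpg-encoded frames from a video file at ~target_fps."""
    import cv2  # pip install opencv-python
    cap = cv2.VideoCapture(path)
    src_fps = cap.get(cv2.CAP_PROP_FPS) or target_fps
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if keep_frame(index, src_fps, target_fps):
            ok, jpg = cv2.imencode(".jpg", frame)
            if ok:
                yield jpg.tobytes()
        index += 1
    cap.release()
```

The yielded bytes can then be fed into whatever the webcam path already does with its jpgs.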
Encyrption
Encyrption5d ago
I'll take a look at the source, might be able to adapt.
MrAssisted
MrAssisted5d ago
PRs welcome lol
Encyrption
Encyrption5d ago
Also, for my server -><- runpod you are not limited to websockets. If it were me, I would have serverless connect to my_server via OpenVPN. Really easy to set up and automate.
MrAssisted
MrAssisted5d ago
Interesting. I tried Cloudflare Tunnel but couldn't figure it out; hadn't thought of a VPN. Would that scale? Like, many people could be using it at the same time? And how would I send the frames?
Encyrption
Encyrption5d ago
Assuming you don't care about hyper-focusing on encryption, there is very little overhead. You could send frames via SCP, or you could mount a disk from the server... If you want to keep it off disk, you could receive it via a webhook... really, any way you can on a network. With that said, I am in the process of building out a front end that acts as websocket server and client... sometimes it is the best way to go.
MrAssisted
MrAssisted5d ago
tbh might want to flip back to the original i2i-realtime and use zmq https://github.com/kylemcdonald/i2i-realtime/blob/main/worker_app.py
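In that spirit, a minimal pyzmq sketch of the server -><- runpod hop; the socket types and wire format here are assumptions for illustration, not i2i-realtime's actual protocol:

```python
import struct

def pack_frame(frame_id: int, jpg: bytes) -> bytes:
    """4-byte big-endian frame id followed by the jpg payload."""
    return struct.pack(">I", frame_id) + jpg

def unpack_frame(msg: bytes) -> tuple:
    """Inverse of pack_frame."""
    (frame_id,) = struct.unpack(">I", msg[:4])
    return frame_id, msg[4:]

def run_worker(endpoint: str = "tcp://server:5555") -> None:
    """Pull frames from the server over a ZeroMQ PULL socket."""
    import zmq  # pip install pyzmq
    ctx = zmq.Context()
    pull = ctx.socket(zmq.PULL)  # frames in from the server's PUSH socket
    pull.connect(endpoint)       # worker dials out, like the ws approach
    while True:
        frame_id, jpg = unpack_frame(pull.recv())
        # ...run the diffusion step on `jpg`, send the result back...
```

The frame id keeps results orderable on the server side even if workers return them out of order.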
Encyrption
Encyrption5d ago
Yes, combine that with an OpenVPN connection and you're in business. RunPod has teased adding backend networks that serverless can connect to, along with your external servers. Until that happens, OpenVPN is pretty close to that.
MrAssisted
MrAssisted5d ago
quick question: do we pay for pod startup time?
nerdylive
nerdylive5d ago
Pods, I guess, yes, you pay for them.