Using network volume with serverless

I am running a stateless model within serverless to modify a provided image. I am wondering if the network volume could be used instead of S3 to upload input and output files? Has anybody done anything similar? Could you share your experience and thoughts? PS Maybe somebody has implemented a tricky solution to improve the double upload/download performance? Currently I am using an S3 bucket for this, but I feel like there might be a better solution.
21 Replies
justin, 8mo ago
You can. When a network volume is mounted for serverless, the path is runpod-volume, I believe. The only thing with a network volume is that you get restricted to a region and can lose availability. I think an S3 bucket is much better, or Firebase file storage / a GCP bucket. If the speed is too slow, maybe you can do a concurrent download? S3 allows what are called range downloads. I've experimented with this for GCP buckets / S3, where you can do a parallel download of a single large file. Maybe that is what you are looking for?
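A rough sketch of the range-download idea, assuming the object size is known up front. `fetch_range` is a hypothetical callable, not part of any library: with boto3 it could wrap `get_object(..., Range=f"bytes={start}-{end}")`, and HTTP byte ranges are inclusive on both ends. The part size and worker count are arbitrary placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(size: int, part_size: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte ranges that cover `size` bytes."""
    return [
        (start, min(start + part_size, size) - 1)
        for start in range(0, size, part_size)
    ]


def parallel_download(
    fetch_range: Callable[[int, int], bytes],  # hypothetical: returns bytes start..end
    size: int,
    part_size: int = 8 * 1024 * 1024,  # assumed 8 MiB parts
    workers: int = 8,
) -> bytes:
    """Fetch every range concurrently and join the parts in order."""
    ranges = split_ranges(size, part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the order of `ranges`, not completion order
        parts = pool.map(lambda r: fetch_range(*r), ranges)
        return b"".join(parts)
```

This only pays off for a single large object. For files of a few MB the per-request overhead is likely to outweigh the gain.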
agentpietrucha (OP), 8mo ago
I send a lot of small files (1 to 5 MB each), and I send them frequently, let's say 1000/hour. In your opinion, would parallel download/upload be suitable for such a case, or maybe a little overkill?
nerdylive, 8mo ago
I think network storage would be more suitable since there's less latency, I guess? Since they are in the same DC or region.
justin, 8mo ago
Network storage, I think, has slow I/O operations in general. Their block storage solution isn't that good: it restricts the types of GPUs that can be used with the network storage, and the restriction to a region is pretty bad. I've seen people's serverless endpoints get completely throttled because an entire region goes down or a large user ate up all the GPUs in the region. So it's better to always decentralize the data, imo.

Hmmm, I'm not too sure, that's a pretty unique case. If one worker is only downloading one file, it wouldn't make a difference. If you are downloading a bunch of files, then you can download them in parallel, and that would make a difference.

You could potentially put a Redis cache like Upstash in front of everything, if there are any sort of patterns. I wonder if you could use something like https://turso.tech/ or https://cockroachlabs.cloud/ too (1 MB to 5 MB may be too much, but do you really need to pull in all that data?). I personally use decentralized databases when I can, and Firebase file storage only if I have to. I'm not at the point of file download speed constraints where I've had to optimize this for RunPod, but for other services / projects I use Upstash as my Redis caching mechanism.
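For the "bunch of files" case, a minimal sketch of downloading many small objects at once with a thread pool. `fetch` is a hypothetical callable that returns one object's bytes for a given key (for example a thin wrapper around an S3 or GCS client); the worker count is an assumption to tune.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def fetch_many(
    keys: Iterable[str],
    fetch: Callable[[str], bytes],  # hypothetical: downloads one object by key
    workers: int = 16,
) -> Dict[str, bytes]:
    """Download several small objects concurrently, keyed by object name."""
    key_list = list(keys)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs each key with its data
        return dict(zip(key_list, pool.map(fetch, key_list)))
```

If each worker handles exactly one image per request, this does not help, as noted above. It only matters when one job needs several files.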
nerdylive, 8mo ago
Yeah, considering those limitations too, you should create multiple regions and multiple network storage volumes if you want. Is it really that bad on I/O operations?
justin, 8mo ago
Yeah, I think it's a pain though, because then you need to somehow sync the status of all your network regions / serverless endpoints. Hopefully RunPod comes out with their multi-region offering soon 😢
nerdylive, 8mo ago
SQLite, but for prod? 😮😮
justin, 8mo ago
There was a discussion about this previously under #🧐|feedback for multi-region support, if curious xD. Yeah, Turso is pretty good, very fast.
nerdylive, 8mo ago
Ooh, that's great. I hope they implement that too.
justin, 8mo ago
I use Cockroach tho. Kinda interesting having to set up everything tho 😢. It took me a bit to set up my Dockerfile to work with Cockroach / Turso, but once you get it working, it's a nice base to work off of. Definitely more and more companies have been looking at the whole decentralized database approach in the last couple of years.
agentpietrucha (OP), 8mo ago
Thanks guys for your help, suggestions, and opinions on RunPod network storage ✌🏻. I guess I will stick with standard S3 for my case. I'm handling images, so a DB won't be suitable for me, I guess.
justin, 8mo ago
How good do your images need to be haha? JPEG maybe? xD WebP?
agentpietrucha (OP), 8mo ago
Input is either JPEG or PNG. Output is PNG.
justin, 8mo ago
Got it~ Yeah, idk the use case, but you could potentially use something like WebP 🤔 and do a live conversion to .png if the end product needs to be that. You'll prob save a lot of space. But interesting~ gl gl ~ I guess fewer file conversions also means less headache haha
agentpietrucha (OP), 8mo ago
Yeah, I've been thinking about WebP conversion, but haven't done anything about it yet. The devil is in the details. The main app that uses RunPod is a desktop app, which may run on different PC specs. So file conversion could be even worse than uploading the extra MB.
justin, 8mo ago
I don't think it's too bad tbh. I do an automatic YT video generation project right now, and I just have a fly.io endpoint that does a bunch of conversions for me. It spins up and down all the time and is super easy to manage. Yeah, the details are definitely always the problem haha.

I don't think WebP is that bad tho. Even if the end customer got a "webp" image, worst comes to worst they'd prob just go and manually convert it to whatever else. Idk, having just dealt with a lot of images, WebP has been a lifesaver. I just move everything to WebP now because of storage + functionality, and if need be, you can go from lossy to lossless with it, so it's super nice to work off of as a base.

But yeah~ maybe an over-optimization too early xD. Maybe just keep going with your solution right now till you hit a bottleneck.
agentpietrucha (OP), 8mo ago
You're probably right. You reminded me about the WebP conversion. I will give it a second thought. Thanks!
flash-singh, 8mo ago
Network storage is best; then use an async CPU pod to sync to cloud storage like S3. Internet speed and latency are much worse compared to network storage.
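A minimal sketch of that write-then-sync pattern, under stated assumptions: the worker writes results into a directory on the network volume (mounted at something like /runpod-volume, per the earlier reply), and a separate process later uploads finished files and removes them. `save_output`, `sync_pending`, and `upload` are hypothetical names; `upload` stands in for whatever S3 client call is used.

```python
import os
from pathlib import Path
from typing import Callable, List


def save_output(volume_dir: Path, name: str, data: bytes) -> Path:
    """Worker side: write atomically so the sync job never sees a partial file."""
    volume_dir.mkdir(parents=True, exist_ok=True)
    tmp = volume_dir / (name + ".part")
    tmp.write_bytes(data)
    final = volume_dir / name
    os.replace(tmp, final)  # atomic rename within the same filesystem
    return final


def sync_pending(volume_dir: Path, upload: Callable[[Path], None]) -> List[str]:
    """Sync side: upload each finished file, then delete it from the volume."""
    synced: List[str] = []
    if not volume_dir.exists():
        return synced
    for path in sorted(volume_dir.iterdir()):
        # skip directories and files still being written
        if not path.is_file() or path.suffix == ".part":
            continue
        upload(path)  # hypothetical: e.g. a wrapper around an S3 upload call
        path.unlink()
        synced.append(path.name)
    return synced
```

The file is deleted only after `upload` returns, so a failed upload leaves it in place for the next pass.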
nerdylive, 8mo ago
Ah, agreed. System design 101 😄
justin, 8mo ago
idk, I still hate region restrictions lol + it becomes a problem if the region is throttled / if you ever need to migrate. I guess it depends what you are aiming for. I'm okay trading time for availability.
flash-singh, 8mo ago
That's the biggest downside to network storage.