RunPod · 10mo ago
kdcd

Directing requests from the same user to the same worker

Guys, thank you for your work. We are enjoying your platform. I have the following workflow: on the first request from a user, the worker does some hard work (about 15-20 s) and caches the result, so all subsequent requests are very fast (~150 ms). But if one of the subsequent requests goes to another worker, that worker has to repeat the hard work (15-20 s). Is there any way to direct all subsequent calls from the same user to the same worker?
ashleyk · 10mo ago
You only really benefit from FlashBoot if you have a constant flow of requests. Otherwise you can either set an Active worker or increase the idle timeout.
flash-singh · 10mo ago
@kdcd you can use request count scaling and do something like: for the first 100 requests you only need 1 worker, etc.
kdcd (OP) · 10mo ago
It seems I have introduced a bit of confusion with my explanation of the workflow, so I will expand on it. My model works on rendered construction-drawing PDFs. When a user makes a request, the PDF is downloaded from S3 and rendered to a high-quality image, which can take ~5-30 s depending on the PDF. Each user has their own PDF. On a subsequent request, if it arrives at the same worker, the hard work (downloading, rendering) is already done and only the model evaluation runs, which is fast (~150 ms). But if the request arrives at another worker, it has to download and render everything again. If we scale our workers to 10-20, which is what we are planning to do, it will quite ruin the experience for the user, because every PDF will get 10-20 very slow requests.
justin · 10mo ago
You sound like you want a caching mechanism; your best bet is network storage. All pods on serverless can attach to a network storage volume, which lets you persist data between workers / runs, so all workers share the same backing storage. https://docs.runpod.io/serverless/references/endpoint-configurations#select-network-volume
Essentially your workflow should then look like:
1) Worker gets a job
2) Check network storage for the client ID > if it exists, pull the existing resources > if not, create a new folder
3) Continue with the job from whatever point
4) Write results to network storage if needed, for other workers
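A minimal sketch of what that cache-on-network-volume flow could look like in a RunPod serverless handler. It assumes the network volume is mounted at /runpod-volume (the usual serverless mount point) and uses placeholder render_pdf_to_image / run_model functions standing in for the real rendering and inference code; none of these names come from the thread.
```python
# Sketch only: cache expensive renders on the shared network volume so any
# worker attached to the same volume can reuse them.
import os
import runpod

CACHE_ROOT = "/runpod-volume/render-cache"  # assumed serverless mount point

def render_pdf_to_image(pdf_id: str, out_path: str) -> None:
    """Placeholder for the slow download-from-S3 + render step (5-30 s)."""
    raise NotImplementedError

def run_model(image_path: str, payload: dict) -> dict:
    """Placeholder for the fast (~150 ms) model evaluation."""
    raise NotImplementedError

def handler(job):
    inp = job["input"]
    pdf_id = inp["pdf_id"]  # hypothetical per-user / per-document identifier
    cached_image = os.path.join(CACHE_ROOT, pdf_id, "page.png")

    # 1) Check the shared network volume for an existing render.
    if not os.path.exists(cached_image):
        # 2) Cache miss: do the expensive work once and persist it
        #    so other workers on the same volume can reuse it.
        os.makedirs(os.path.dirname(cached_image), exist_ok=True)
        render_pdf_to_image(pdf_id, cached_image)

    # 3) Fast path: serve from the cached render.
    return run_model(cached_image, inp)

runpod.serverless.start({"handler": handler})
```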
kdcd (OP) · 10mo ago
Yep, that's nice, thanks a lot. The only thing is that it will limit workers to one data center.
justin · 10mo ago
I think that's just the cost you need to eat. Or you can write to Firebase file storage, which is what I do, and download it from there. Because of what you said, I actually prefer to use my own storage mechanism, especially since your files don't sound insanely big, either the final files or the initial resources. https://github.com/justinwlin/FirebaseStorageWrapperPython (my personal wrapper lol)
kdcd (OP) · 10mo ago
🙂 Much appreciated. But would Firebase be faster than just uploading files to S3? Never heard about it.
justin · 10mo ago
Firebase is backed by Google Cloud buckets / it's run by Google as a Google service; it's just an easier wrapper around Google Cloud buckets, so I like Firebase. S3 is also fine, I just hate AWS xD. Honestly I'd avoid AWS / Google buckets if I could 😆, but there are no better file storage / object storage providers out there.
kdcd (OP) · 10mo ago
🙂 Who loves them?
justin · 10mo ago
But yeah, I also just think it's easier for me to have an easy wrapper around Google Firebase file storage + they've got a nice UI + I get a ton of file storage for free before I need to pay for it. So it's great for me for developing, because I don't need to keep paying AWS ingress/egress costs.
kdcd (OP) · 10mo ago
Nice, nice. We just already have a lot of infra around S3 😦
justin · 10mo ago
Haha, then go with S3. But yeah, not too bad. And since I work with really long audio / videos on RunPod: if your files can be optimized before sending / downloading them (compressing, converting file formats, stripping unnecessary data, etc.), that can also help get things moving faster. But honestly your files sound small enough that it might not be necessary; I don't know how big your files are though.
kdcd (OP) · 10mo ago
It depends. A lot of the PDFs are quite small, ~30 MB, but render time can still be quite long. Some of them are about 500 MB.
justin · 10mo ago
I see. For the bigger ones, what I do: S3 supports range downloads, so you can upload / download files in parallel. For large files that's probably what you want to look into; that's what I did for my larger files.
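For reference, a rough sketch of what parallel (multipart / ranged) S3 transfers look like with boto3's managed transfer configuration; the bucket and key names are placeholders, not anything from the thread.
```python
# Sketch: let boto3 split large objects into parts and transfer them in parallel.
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Split anything over ~25 MB into 25 MB parts and move 10 parts at a time.
config = TransferConfig(
    multipart_threshold=25 * 1024 * 1024,
    multipart_chunksize=25 * 1024 * 1024,
    max_concurrency=10,
)

# Downloads above the threshold are fetched with parallel ranged GETs.
s3.download_file("my-bucket", "drawings/big-plan.pdf", "/tmp/big-plan.pdf", Config=config)

# Uploads above the threshold become parallel multipart uploads.
s3.upload_file("/tmp/rendered.png", "my-bucket", "renders/big-plan.png", Config=config)
```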
kdcd (OP) · 10mo ago
Thanks for the help
justin · 10mo ago
Yup, no worries. And if you REALLY want to xD: https://discord.com/channels/912829806415085598/1200525738449846342 you can optimize even further with a concurrent worker hahaha. I don't know how much GPU you are eating up though, but in my mind a PDF renderer might not be eating up GPU resources all the way - I could be wrong, but it's something I've been playing with lol. My video / audio transcriber does eat up a lot of resources, so I could only get maybe 2 concurrent things going at once. Anyways, gl 🙂
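A hedged sketch of that concurrent-worker idea, assuming RunPod's Python SDK concurrency_modifier option: an async handler plus a modifier that caps how many jobs one worker takes at once. The cap of 2 just mirrors the figure mentioned above, and process_request is a placeholder, not real API.
```python
# Sketch only: let one worker pull up to MAX_CONCURRENCY jobs at a time.
import runpod

MAX_CONCURRENCY = 2  # how many jobs a single worker handles at once

async def process_request(payload: dict) -> dict:
    """Placeholder for the actual per-request work (rendering / inference)."""
    raise NotImplementedError

async def handler(job):
    # Each concurrent job runs through the same async handler.
    return await process_request(job["input"])

def concurrency_modifier(current_concurrency: int) -> int:
    # Ramp up one job at a time until the worker hits the cap.
    return min(current_concurrency + 1, MAX_CONCURRENCY)

runpod.serverless.start({
    "handler": handler,
    "concurrency_modifier": concurrency_modifier,
})
```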
kdcd (OP) · 10mo ago
Good luck to you too 🙂
Solution
justin · 10mo ago
Just a summary so I can mark this as the solution: 1) You can use network storage to persist data between runs. 2) Use an outside file storage / object storage provider. 3) If using Google Cloud / an S3 bucket, you can use parallel downloads / uploads for large files; there should be existing tooling out there, or you can obviously make your own.