kdcd
RRunPod
•Created by kdcd on 2/15/2024 in #⚡|serverless
Directing requests from the same user to the same worker
Good luck for you too 🙂
46 replies
RRunPod
•Created by kdcd on 2/15/2024 in #⚡|serverless
Directing requests from the same user to the same worker
Thanks for the help
46 replies
RRunPod
•Created by kdcd on 2/15/2024 in #⚡|serverless
Directing requests from the same user to the same worker
It depends. A lot of the PDFs are quite small, around 30 MB, but render time can still be long. Some of them are about 500 MB.
46 replies
RRunPod
•Created by kdcd on 2/15/2024 in #⚡|serverless
Directing requests from the same user to the same worker
We just already have a lot of infra around s3 😦
46 replies
RRunPod
•Created by kdcd on 2/15/2024 in #⚡|serverless
Directing requests from the same user to the same worker
nice, nice
46 replies
RRunPod
•Created by kdcd on 2/15/2024 in #⚡|serverless
Directing requests from the same user to the same worker
🙂 Who loves them ?
46 replies
RRunPod
•Created by kdcd on 2/15/2024 in #⚡|serverless
Directing requests from the same user to the same worker
Never heard about it
46 replies
RRunPod
•Created by kdcd on 2/15/2024 in #⚡|serverless
Directing requests from the same user to the same worker
🙂 Much appreciated. But would Firebase be faster than just uploading files to S3?
46 replies
RRunPod
•Created by kdcd on 2/15/2024 in #⚡|serverless
Directing requests from the same user to the same worker
Yep, that's nice, thanks a lot. The only thing is that it will limit workers to one datacenter.
46 replies
RRunPod
•Created by kdcd on 2/15/2024 in #⚡|serverless
Directing requests from the same user to the same worker
It seems I have introduced a bit of confusion with my explanation of the workflow, so I will expand on it. My model works on rendered PDFs of construction drawings. When a user makes a request, the PDF is downloaded from S3 and rendered into a high-quality image, which can take ~5-30 s depending on the PDF. Each user has their own PDF. On a subsequent request, if it arrives at the same worker, the hard work (downloading, rendering) is already done and only the model evaluation runs, which is fast (150 ms). But if the request arrives at another worker, that worker has to download and render everything again. If we scale our workers to 10-20, which is what we are planning to do, it will quite ruin the experience for the user, because every PDF will incur 10-20 very slow requests.
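A minimal sketch of the per-worker caching described above, to make the cost of landing on a "cold" worker concrete. Everything here is an assumption for illustration: `RenderCache`, `fake_render`, and the key names are hypothetical, and the real download-and-render step is replaced by a stand-in.

```python
from collections import OrderedDict
from typing import Callable


class RenderCache:
    """Small in-process LRU cache for expensive per-PDF work (hypothetical)."""

    def __init__(self, render: Callable[[str], object], max_items: int = 4):
        self._render = render        # stand-in for download + render
        self._max_items = max_items
        self._items = OrderedDict()  # pdf_key -> rendered result
        self.misses = 0

    def get(self, pdf_key: str) -> object:
        if pdf_key in self._items:
            self._items.move_to_end(pdf_key)  # mark as most recently used
            return self._items[pdf_key]
        self.misses += 1
        rendered = self._render(pdf_key)      # slow path (5-30 s in the thread)
        self._items[pdf_key] = rendered
        if len(self._items) > self._max_items:
            self._items.popitem(last=False)   # evict the least recently used
        return rendered


def fake_render(pdf_key: str) -> str:
    # Assumption: replaces the real S3 download and PDF rendering.
    return f"rendered:{pdf_key}"


cache = RenderCache(fake_render, max_items=2)
cache.get("user-a.pdf")  # miss: pays the download + render cost
cache.get("user-a.pdf")  # hit: only the model evaluation would remain
cache.get("user-b.pdf")  # miss: a different user's PDF
```

The cache lives in one worker's memory, so it only pays off when a user's follow-up requests reach the worker that already holds their PDF. With 10-20 workers and no routing, each worker starts with its own empty cache.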
46 replies
RRunPod
•Created by kdcd on 1/31/2024 in #⚡|serverless
Pause on the yield in async handler
now it's about 150 ms, it seems pause varies depending on the machine
18 replies
RRunPod
•Created by kdcd on 1/31/2024 in #⚡|serverless
Pause on the yield in async handler
Here is the output
2024-02-03T14:31:22.063018714Z yield mean 147.43 ms, max 163.44 ms
2024-02-03T14:31:22.063076544Z {"requestId": "1ed8d0e3-7bed-4edf-b782-673b60a6f42f-u1", "message": "Finished running generator.", "level": "INFO"}
2024-02-03T14:31:22.301744007Z {"requestId": "1ed8d0e3-7bed-4edf-b782-673b60a6f42f-u1", "message": "Finished.", "level": "INFO"}
2024-02-03T14:31:41.269064500Z yield mean 144.33 ms, max 164.62 ms
2024-02-03T14:31:41.269096620Z {"requestId": "e2ecb24a-1784-4863-8770-9ba2d382d928-u1", "message": "Finished running generator.", "level": "INFO"}
2024-02-03T14:31:41.509818007Z {"requestId": "e2ecb24a-1784-4863-8770-9ba2d382d928-u1", "message": "Finished.", "level": "INFO"}
2024-02-03T14:32:23.582166391Z yield mean 142.08 ms, max 157.96 ms
2024-02-03T14:32:23.582201321Z {"requestId": "sync-34216b86-9f68-4cf9-ab87-c16de49a4a83-u1", "message": "Finished running generator.", "level": "INFO"}
2024-02-03T14:32:23.805450836Z {"requestId": "sync-34216b86-9f68-4cf9-ab87-c16de49a4a83-u1", "message": "Finished.", "level": "INFO"}
18 replies
RRunPod
•Created by kdcd on 1/31/2024 in #⚡|serverless
Pause on the yield in async handler
And i have deployed it just now with 1.6.0
18 replies
RRunPod
•Created by kdcd on 1/31/2024 in #⚡|serverless
Pause on the yield in async handler
I wrote a simple handler and measured how long each yield pauses:

import time

import runpod


async def async_generator_handler(job):
    count = 30
    sum_time = 0
    max_time = 0
    for _ in range(count):
        start = time.monotonic()
        yield
        # Time spent suspended at the yield, in milliseconds.
        yield_time = (time.monotonic() - start) * 1000
        sum_time += yield_time
        max_time = max(max_time, yield_time)
    mean_time = sum_time / count
    print(f"yield mean {mean_time:.2f} ms, max {max_time:.2f} ms")


runpod.serverless.start({"handler": async_generator_handler})  # Required.
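For comparison, the same measurement can be run locally with a bare `asyncio` consumer and no RunPod runtime involved. This is only a sketch of a baseline: `timed_generator` and `drive_locally` are hypothetical names, and the numbers it prints say nothing about RunPod itself, only about how cheap a yield is when nothing happens between iterations.

```python
import asyncio
import time


async def timed_generator(count: int, stats: dict):
    total_ms = 0.0
    max_ms = 0.0
    for _ in range(count):
        start = time.monotonic()
        yield
        # Time spent suspended at the yield, in milliseconds.
        elapsed_ms = (time.monotonic() - start) * 1000
        total_ms += elapsed_ms
        max_ms = max(max_ms, elapsed_ms)
    stats["mean_ms"] = total_ms / count
    stats["max_ms"] = max_ms


async def drive_locally(count: int = 30) -> dict:
    stats = {}
    async for _ in timed_generator(count, stats):
        pass  # bare consumer: no network calls, no job bookkeeping
    return stats


baseline = asyncio.run(drive_locally())
print(f"local yield mean {baseline['mean_ms']:.3f} ms, max {baseline['max_ms']:.3f} ms")
```

If the local pause is negligible, the time measured on a deployed worker is spent in whatever the consumer does between iterations rather than in the generator itself.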
18 replies
RRunPod
•Created by kdcd on 1/31/2024 in #⚡|serverless
Pause on the yield in async handler
1.6.0
18 replies
RRunPod
•Created by kdcd on 1/31/2024 in #⚡|serverless
Pause on the yield in async handler
now it's about 50ms, it's nice, but still it's 30% of effective work.
18 replies
RRunPod
•Created by kdcd on 1/31/2024 in #⚡|serverless
Pause on the yield in async handler
will try updating
18 replies
RRunPod
•Created by kdcd on 1/31/2024 in #⚡|serverless
Pause on the yield in async handler
1.3.3
18 replies
RRunPod
•Created by kdcd on 1/31/2024 in #⚡|serverless
Pause on the yield in async handler
Thank you for your swift answer. Overall we are enjoying runpod, pods and serverless. It's nice work you have done.
I am evaluating the model in a loop and streaming results to the user. Evaluation takes about 150 ms on average and I don't want to lose too much time.
I understand. I can probably write results to our queue directly.
I am just curious: what is the source of this delay? It seems that everything RunPod does in this case shouldn't take too long.
Also, for me it's important that the overall loop takes as little time as possible, but when each yield pauses the loop for 150 ms, it doubles the time: instead of 10 s, it now takes 20 s. An additional 150 ms delay in delivering a message to the user is not a problem at all, though. Streaming is just for showing progress and keeping the user entertained. I think if I do something like the code below, in theory the yields shouldn't slow down the loop anymore. What do you think, will it work?
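import asyncio


async def loop(job, queue):
    search = create_search(job)
    # The original passed an undefined `request` here; `job` is used instead.
    async for msg in search.run_search_generator(job):
        queue.put_nowait(msg)


async def handler(job):
    queue = asyncio.Queue()
    # Keep a reference so the background task is not garbage collected.
    task = asyncio.create_task(loop(job, queue))
    while True:
        msg = await queue.get()
        if msg["status"] != "in_progress":
            break
        yield msg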
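A self-contained sketch of the same idea that can be run without RunPod, to check whether the producer loop really runs at full speed while the consumer is slow. All names are hypothetical: `fake_search` stands in for `search.run_search_generator`, and the pause after each yield is simulated with `asyncio.sleep` in the driver.

```python
import asyncio
import time


async def fake_search(steps: int, step_s: float):
    # Stand-in for the real search: one "model evaluation" per step.
    for i in range(steps):
        await asyncio.sleep(step_s)
        yield {"status": "in_progress", "step": i}
    yield {"status": "done"}


async def produce(queue: asyncio.Queue, steps: int, step_s: float) -> float:
    # Runs the search loop and returns how long the loop itself took.
    start = time.monotonic()
    async for msg in fake_search(steps, step_s):
        queue.put_nowait(msg)
    return time.monotonic() - start


async def handler(stats: dict, steps: int = 5, step_s: float = 0.01):
    queue = asyncio.Queue()
    task = asyncio.create_task(produce(queue, steps, step_s))
    while True:
        msg = await queue.get()
        if msg["status"] != "in_progress":
            break
        yield msg
    stats["loop_s"] = await task  # producer has finished by this point


async def demo(pause_s: float = 0.05):
    stats = {}
    received = []
    start = time.monotonic()
    async for msg in handler(stats):
        received.append(msg)
        await asyncio.sleep(pause_s)  # simulated pause after each yield
    stats["total_s"] = time.monotonic() - start
    return received, stats


received, stats = asyncio.run(demo())
print(f"loop {stats['loop_s']:.3f} s, total streaming {stats['total_s']:.3f} s")
```

This only helps if the pause is an awaited, non-blocking wait, as it is in this simulation. If the runtime blocks the event loop during the pause, the background task cannot make progress either, and the loop would still be slowed down.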
18 replies