kdcd Comments - Answer Overflow

It seems I introduced some confusion with my explanation of the workflow, so I will expand on it. My model works on rendered PDFs of construction drawings. When a user makes a request, the PDF is downloaded from S3 and rendered into a high-quality image, which takes roughly 5-30 s depending on the PDF. Each user has their own PDF. On a subsequent request, if it arrives at the same worker, the hard work (downloading, rendering) is already done and only the model evaluation runs, which is fast (150 ms). But if the request arrives at another worker, that worker has to download and render everything again. If we scale to 10-20 workers, which is what we are planning to do, it will pretty much ruin the experience for the user, because each PDF will incur 10-20 very slow requests.
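A minimal sketch of the per-worker caching behaviour described above, to make the cold/warm difference concrete. Everything here is an assumption for illustration: `slow_render` is a hypothetical stand-in for the S3 download plus rendering step, `RenderCache` is not part of any RunPod API, and the key `user1/plan.pdf` is made up.

```python
import time
from typing import Callable, Dict


def slow_render(pdf_key: str) -> bytes:
    """Hypothetical stand-in for 'download from S3 + render' (really ~5-30 s)."""
    time.sleep(0.05)
    return f"rendered:{pdf_key}".encode()


class RenderCache:
    """In-memory cache that lives only as long as one worker process."""

    def __init__(self, render: Callable[[str], bytes]) -> None:
        self._render = render
        self._images: Dict[str, bytes] = {}
        self.misses = 0  # number of times the slow path was taken

    def get(self, pdf_key: str) -> bytes:
        if pdf_key not in self._images:
            self.misses += 1
            self._images[pdf_key] = self._render(pdf_key)
        return self._images[pdf_key]


# Each worker holds its own cache, so routing to a new worker is a cold miss.
worker_a = RenderCache(slow_render)
worker_b = RenderCache(slow_render)

worker_a.get("user1/plan.pdf")  # cold: pays the rendering cost
worker_a.get("user1/plan.pdf")  # warm: only model evaluation would remain
worker_b.get("user1/plan.pdf")  # cold again, because it is a different worker
```

With N workers and no request affinity, the same PDF can hit the slow path up to N times, which is the 10-20 slow requests mentioned above.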

46 replies

RunPod

•Created by kdcd on 1/31/2024 in #⚡｜serverless

Pause on the yield in async handler

Now it's about 150 ms; it seems the pause varies depending on the machine.

18 replies

Here is the output:

2024-02-03T14:31:22.063018714Z yield mean 147.43 ms, max 163.44 ms
2024-02-03T14:31:22.063076544Z {"requestId": "1ed8d0e3-7bed-4edf-b782-673b60a6f42f-u1", "message": "Finished running generator.", "level": "INFO"}
2024-02-03T14:31:22.301744007Z {"requestId": "1ed8d0e3-7bed-4edf-b782-673b60a6f42f-u1", "message": "Finished.", "level": "INFO"}
2024-02-03T14:31:41.269064500Z yield mean 144.33 ms, max 164.62 ms
2024-02-03T14:31:41.269096620Z {"requestId": "e2ecb24a-1784-4863-8770-9ba2d382d928-u1", "message": "Finished running generator.", "level": "INFO"}
2024-02-03T14:31:41.509818007Z {"requestId": "e2ecb24a-1784-4863-8770-9ba2d382d928-u1", "message": "Finished.", "level": "INFO"}
2024-02-03T14:32:23.582166391Z yield mean 142.08 ms, max 157.96 ms
2024-02-03T14:32:23.582201321Z {"requestId": "sync-34216b86-9f68-4cf9-ab87-c16de49a4a83-u1", "message": "Finished running generator.", "level": "INFO"}
2024-02-03T14:32:23.805450836Z {"requestId": "sync-34216b86-9f68-4cf9-ab87-c16de49a4a83-u1", "message": "Finished.", "level": "INFO"}

And I have deployed it just now with 1.6.0.

I wrote a simple handler and measured how long each yield pauses:

import time

import runpod


async def async_generator_handler(job):
    count = 30
    sum_time = 0
    max_time = 0
    for _ in range(count):
        start = time.monotonic()
        yield
        # Time spent suspended at the yield, in milliseconds.
        yield_time = (time.monotonic() - start) * 1000
        sum_time += yield_time
        max_time = max(max_time, yield_time)
    mean_time = sum_time / count
    print(f"yield mean {mean_time:.2f} ms, max {max_time:.2f} ms")


runpod.serverless.start({"handler": async_generator_handler})  # Required.
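For comparison, the same measurement can be sketched locally without the RunPod SDK. This is only an illustration under stated assumptions: `consume` is a hypothetical consumer that stands in for whatever work happens between two items, and `consumer_delay_s` is an invented parameter, not something the SDK exposes. It shows that the time measured around a `yield` is the time the consumer takes before asking for the next item.

```python
import asyncio
import time


async def timed_generator(count, stats):
    """Yield `count` times and record how long each yield stays suspended (ms)."""
    for i in range(count):
        start = time.monotonic()
        yield i
        stats.append((time.monotonic() - start) * 1000)


async def consume(count, consumer_delay_s):
    """Hypothetical consumer: waits `consumer_delay_s` after each received item."""
    stats = []
    async for _ in timed_generator(count, stats):
        await asyncio.sleep(consumer_delay_s)
    return stats


baseline = asyncio.run(consume(30, 0.0))   # consumer does almost nothing
delayed = asyncio.run(consume(30, 0.01))   # consumer takes ~10 ms per item

print(f"baseline mean {sum(baseline) / len(baseline):.3f} ms")
print(f"delayed  mean {sum(delayed) / len(delayed):.3f} ms")
```

If the baseline is close to zero while the deployed handler reports ~150 ms, the pause comes from the consuming side rather than from the generator itself.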

1.6.0

Now it's about 50 ms. That's nice, but it is still about 30% of the effective work.

will try updating

1.3.3

Thank you for your swift answer. Overall we are enjoying RunPod, both pods and serverless; it's nice work you have done. I am evaluating a model in a loop and streaming the results to the user. Each evaluation takes about 150 ms on average, and I don't want to lose too much time. I understand; I can probably write results to our queue directly. I am just curious what the source of this delay is, since it seems that what RunPod does in this case shouldn't take that long.

Also, for me it's important that the overall loop takes as little time as possible, but when each yield pauses the loop for 150 ms, it doubles the time: instead of 10 s, it's now 20 s. An additional 150 ms delay in delivering a message to the user is not a problem at all. Streaming is just for showing progress and keeping the user entertained. I think if I do something like the code below, then in theory the yields shouldn't slow down the loop. What do you think, will it work?

import asyncio


async def loop(job, queue):
    search = create_search(job)
    # `request` is not defined in this snippet; presumably it is built from `job`.
    async for msg in search.run_search_generator(request):
        queue.put_nowait(msg)


async def handler(job):
    queue = asyncio.Queue()
    # Run the search in a background task so it is not paused by the yields.
    task = asyncio.create_task(loop(job, queue))
    while True:
        msg = await queue.get()
        if msg["status"] != "in_progress":
            break
        yield msg
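A runnable sketch of this queue idea, with the RunPod parts replaced by stand-ins so it can be tried locally. The names `search_steps`, `direct_handler`, `decoupled_handler` and `slow_consumer`, and the constants `WORK_S` and `PAUSE_S`, are all hypothetical and scaled down from the 150 ms figures in the thread. It assumes the pause at each yield is a non-blocking wait; if the consumer blocked the event loop instead, the background task could not make progress and the decoupling would not help.

```python
import asyncio
import time

WORK_S = 0.02   # stand-in for one model evaluation
PAUSE_S = 0.05  # stand-in for the pause observed at each yield


async def search_steps(n):
    """Hypothetical replacement for search.run_search_generator."""
    for i in range(n):
        await asyncio.sleep(WORK_S)
        yield {"status": "in_progress", "step": i}
    yield {"status": "done"}


async def direct_handler(n, timing):
    """Yield straight from the loop: every consumer pause stalls the search."""
    start = time.monotonic()
    async for msg in search_steps(n):
        if msg["status"] == "in_progress":
            yield msg
    timing["loop_s"] = time.monotonic() - start


async def decoupled_handler(n, timing):
    """Run the search in a background task and relay messages via a queue."""
    queue = asyncio.Queue()

    async def produce():
        start = time.monotonic()
        async for msg in search_steps(n):
            queue.put_nowait(msg)
        timing["loop_s"] = time.monotonic() - start

    task = asyncio.create_task(produce())
    while True:
        msg = await queue.get()
        if msg["status"] != "in_progress":
            break
        yield msg
    await task  # make sure the producer has finished and recorded its time


async def slow_consumer(handler, n):
    """Consume a handler, pausing after each message like the streaming side."""
    timing = {}
    count = 0
    async for _ in handler(n, timing):
        count += 1
        await asyncio.sleep(PAUSE_S)
    return count, timing["loop_s"]


if __name__ == "__main__":
    for handler in (direct_handler, decoupled_handler):
        count, loop_s = asyncio.run(slow_consumer(handler, 5))
        print(f"{handler.__name__}: {count} messages, search loop took {loop_s:.2f} s")
```

In this sketch the search loop in the decoupled version only pays for its own work, while messages still reach the consumer at the consumer's pace, which matches the stated goal of a fast loop with slightly delayed progress updates.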

