Pause on the yield in async handler
I have written an async handler. The messages are really small, a few kilobytes:
async for msg in search.run_search_generator(request):
    start_time = time.perf_counter()
    yield msg
    print("elapsed_time", (time.perf_counter() - start_time) * 1000)
I measured how long each yield pauses the job: about 160 ms. That's a lot for my use case and doubles the total execution time of the job. What are my options?
what are you trying to do? you could implement your own streaming and use the workers as pure compute, handling the streaming yourself with a central layer
everything we do makes things simpler but sacrifices performance in some cases. for the lowest latency, the closer you integrate with the worker the better, e.g. avoid api.runpod.ai calls for output and send data directly from your worker to your central service
Thank you for your swift answer. Overall we are enjoying RunPod, both pods and serverless. Nice work.
I am evaluating a model in a loop and streaming results to the user. Each evaluation takes about 150 ms on average, and I don't want to lose too much time.
I understand. I can probably write results to our queue directly.
I'm just curious what the source of this delay is; it seems that what RunPod does in this case shouldn't take that long.
Also, it's important to me that the overall loop takes as little time as possible, and when each yield pauses the loop for 150 ms, it doubles the runtime: instead of 10 s, it's now 20 s. An extra 150 ms delay in delivering a message to the user is not a problem at all; the streaming is just for showing progress and keeping the user entertained. I think that if I do something like the code below, the yields shouldn't slow down the loop anymore. Do you think it will work?
async def loop(job, queue):
    search = create_search(job)
    async for msg in search.run_search_generator(request):
        queue.put_nowait(msg)

async def handler(job):
    queue = asyncio.Queue()
    task = asyncio.create_task(loop(job, queue))
    while True:
        msg = await queue.get()
        if msg["status"] != "in_progress":
            break
        yield msg
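A minimal runnable sketch of this decoupling pattern, with a stand-in generator in place of search.run_search_generator and a hypothetical 50 ms per-message delivery pause in the consumer, suggests the idea works: the producer loop finishes in roughly n × work time, unaffected by the slow consumer.

```python
import asyncio
import time

async def fake_search_generator(n, work_s):
    # Stand-in for search.run_search_generator: simulate one
    # evaluation step per message, then a final "done" message.
    for i in range(n):
        await asyncio.sleep(work_s)
        yield {"status": "in_progress", "step": i}
    yield {"status": "done"}

async def loop(queue, n, work_s):
    # The search loop: put_nowait never awaits, so this runs at full speed.
    start = time.monotonic()
    async for msg in fake_search_generator(n, work_s):
        queue.put_nowait(msg)
    return time.monotonic() - start

async def handler(n=10, work_s=0.01, delivery_s=0.05):
    queue = asyncio.Queue()
    task = asyncio.create_task(loop(queue, n, work_s))
    delivered = 0
    while True:
        msg = await queue.get()
        if msg["status"] != "in_progress":
            break
        delivered += 1
        await asyncio.sleep(delivery_s)  # simulated slow per-message delivery
    loop_time = await task
    return loop_time, delivered

loop_time, delivered = asyncio.run(handler())
print(f"search loop took {loop_time * 1000:.0f} ms for {delivered} messages")
```

Because queue.put_nowait never suspends, the search loop only yields control at its own await points, so the consumer's slow delivery no longer stalls it: here the loop finishes in roughly 100 ms even though delivering all ten messages takes about 500 ms.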
@Justin Merrell we should look into this, maybe our new rust core is better
What version of the Python SDK are you using?
1.3.3
will try updating
now it's about 50 ms. that's nicer, but it's still 30% of the effective work.
@Justin Merrell where is that time spent?
This is using the latest SDK? Is your repo/test code open source?
1.6.0
And what is the 50ms measuring again?
I wrote a simple handler and measured how long each yield pauses:
async def async_generator_handler(job):
    count = 30
    sum_time = 0
    max_time = 0
    for _ in range(count):
        start = time.monotonic()
        yield
        yield_time = (time.monotonic() - start) * 1000
        sum_time += yield_time
        max_time = max(max_time, yield_time)
    mean_time = sum_time / count
    print(f"yield mean {mean_time:.2f} ms, max {max_time:.2f} ms")

runpod.serverless.start({"handler": async_generator_handler})  # Required.

I deployed it just now with 1.6.0. Here is the output:

2024-02-03T14:31:22.063018714Z yield mean 147.43 ms, max 163.44 ms
2024-02-03T14:31:22.063076544Z {"requestId": "1ed8d0e3-7bed-4edf-b782-673b60a6f42f-u1", "message": "Finished running generator.", "level": "INFO"}
2024-02-03T14:31:22.301744007Z {"requestId": "1ed8d0e3-7bed-4edf-b782-673b60a6f42f-u1", "message": "Finished.", "level": "INFO"}
2024-02-03T14:31:41.269064500Z yield mean 144.33 ms, max 164.62 ms
2024-02-03T14:31:41.269096620Z {"requestId": "e2ecb24a-1784-4863-8770-9ba2d382d928-u1", "message": "Finished running generator.", "level": "INFO"}
2024-02-03T14:31:41.509818007Z {"requestId": "e2ecb24a-1784-4863-8770-9ba2d382d928-u1", "message": "Finished.", "level": "INFO"}
2024-02-03T14:32:23.582166391Z yield mean 142.08 ms, max 157.96 ms
2024-02-03T14:32:23.582201321Z {"requestId": "sync-34216b86-9f68-4cf9-ab87-c16de49a4a83-u1", "message": "Finished running generator.", "level": "INFO"}
2024-02-03T14:32:23.805450836Z {"requestId": "sync-34216b86-9f68-4cf9-ab87-c16de49a4a83-u1", "message": "Finished.", "level": "INFO"}

So it's back to about 150 ms; the pause seems to vary depending on the machine.
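For comparison, here is a local baseline of the same timing loop with the generator consumed directly, no RunPod SDK involved. Resuming an async generator costs on the order of microseconds, which suggests the ~150 ms pause comes from the per-yield work the SDK does (e.g. uploading each result) rather than from the generator machinery itself.

```python
import asyncio
import time

async def bare_handler(count):
    # Same shape as the serverless handler, but with nothing attached:
    # consuming it locally measures only the generator-resumption cost.
    for _ in range(count):
        yield

async def measure(count=1000):
    sum_ms = 0.0
    max_ms = 0.0
    start = time.monotonic()
    async for _ in bare_handler(count):
        now = time.monotonic()
        step_ms = (now - start) * 1000
        sum_ms += step_ms
        max_ms = max(max_ms, step_ms)
        start = now
    return sum_ms / count, max_ms

mean_ms, max_ms = asyncio.run(measure())
print(f"bare yield mean {mean_ms:.4f} ms, max {max_ms:.4f} ms")
```

On a typical machine the mean is well under a millisecond, several orders of magnitude below the 150 ms observed through the SDK.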