How to keep worker memory after completing request?
Hi! I'm running a serverless endpoint for a GAN model. I want to preload the model into memory on the first request and reuse it on subsequent requests without loading it again (as long as the container/pod is still alive). But when I sent a second request, the idle worker hit "clean up worker" and loaded the model again.
How can I prevent "clean up worker" and keep the model in memory (as long as the container hasn't been removed)?
5 Replies
Load the model before you call
runpod.serverless.start()
and enable FlashBoot on your endpoint.

I changed my code as you suggested and the model is preloaded now. But sometimes the worker was still cleaned up when I sent a second request. I checked the logs and got an error from serverless/work_loop:
runpod.serverless.start({"handler": handler})
File "/usr/local/anaconda3/envs/deepfacelab/lib/python3.7/site-packages/runpod/serverless/__init__.py", line 24, in start
    asyncio.run(work_loop.start_worker(config))
File "/usr/local/anaconda3/envs/deepfacelab/lib/python3.7/asyncio/runners.py", line 43, in run
File "/usr/local/anaconda3/envs/deepfacelab/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
    return future.result()
File "/usr/local/anaconda3/envs/deepfacelab/lib/python3.7/site-packages/runpod/serverless/work_loop.py", line 36, in start_worker
    if job["input"] is None:
KeyError: 'input'
Isn't this a bug in the runpod-python package? My request does have an "input" field.
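For reference, the preload pattern from the first reply looks roughly like this. The model loader is a stand-in, and the key point is that anything defined at module level is initialized once per container, not once per request. Note the .get guard in the handler is only defensive style; the KeyError in the traceback above is raised inside the SDK's own work loop, so handler code alone can't fix it.

```python
# Stand-in for an expensive GAN model load (e.g. torch.load on checkpoint weights).
def load_gan_model():
    return {"loaded": True}

# Module-level: runs once when the worker container starts,
# so warm requests reuse the same object instead of reloading.
MODEL = load_gan_model()

def handler(job):
    # Guard against a job missing the "input" field instead of raising KeyError.
    job_input = job.get("input")
    if job_input is None:
        return {"error": "no input provided"}
    return {"output": f"generated from {job_input}"}

# With the RunPod SDK installed, you would then start the worker loop:
# import runpod
# runpod.serverless.start({"handler": handler})
```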
Which version of the SDK are you using? I haven't had any issues like this.
The worker will be cleaned up before the 2nd and subsequent requests if you aren't sending a constant flow of requests. FlashBoot is only beneficial if you send a constant flow of requests to your endpoint.
I'm using SDK 0.9.9, because my project requires Python 3.7.
Why does your project require such an ancient version of Python? You're going to run into a world of pain using an SDK that old.