RunPod
•Created by ZooE1 on 5/22/2024 in #⚡|serverless
Unstable processing speed between different workers.
Hi! I'm deploying the SadTalker model on a serverless endpoint with 24 GB GPU Pro workers. After testing some requests, I noticed that processing times for the same request differ hugely between workers. Here are 2 log files:
1 - Log of the slower worker: 45 s executionTime; iteration speed is 2.09 s/it during the Face Render stage.
2 - Log of a normal worker: 21 s executionTime; iteration speed is approximately 1.30 it/s during the Face Render stage.
My endpoint ID is: schx1xwzhn1lhk
Could anyone help me debug and prevent this issue?
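One way to narrow this down is to log which physical GPU each worker actually got, so slow runs can be correlated with a specific card or clock state. Below is a minimal diagnostic sketch (not part of SadTalker; `gpu_info` and the `[worker-diag]` tag are hypothetical names) that queries `nvidia-smi` at the start of each request:

```python
import subprocess


def gpu_info():
    """Return the GPU name, SM clock, and power limit, or a fallback string.

    Hypothetical diagnostic helper; degrades gracefully where no GPU exists.
    """
    try:
        out = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=name,clocks.sm,power.limit",
             "--format=csv,noheader"],
            capture_output=True, text=True, timeout=5,
        )
        return out.stdout.strip() or "nvidia-smi returned no output"
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return "nvidia-smi not available"


def handler(job):
    # Log once per request so a slow executionTime can be matched to a card.
    print(f"[worker-diag] {gpu_info()}")
    # ... run the actual SadTalker pipeline here ...
```

Comparing these lines across the 45 s and 21 s runs would show whether the gap comes from different GPU models, throttled clocks, or something else entirely.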
1 reply
•Created by ZooE1 on 2/27/2024 in #⚡|serverless
How to keep worker memory after completing a request?
Hi! I'm running a serverless endpoint for a GAN model. I want to preload the model into memory on the first request and reuse it on subsequent requests without loading it again (as long as the container/pod is still alive). But when I sent the second request, the idle worker logged "clean up worker" and loaded the model again.
How can I prevent the "clean up worker" step and keep the model in memory (as long as the container isn't removed)?
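A common pattern, assuming the worker process stays warm between requests, is to cache the model at module level so it outlives a single request: the load runs once per worker process, and later requests to the same warm worker reuse it. Note this cannot survive the worker actually being scaled down; once the process exits, the memory is gone, so keeping at least one active (min) worker on the endpoint may also be needed. A minimal sketch (`load_gan_model` is a hypothetical stand-in for the real loader):

```python
# Sketch of module-level model caching for a serverless handler.
_MODEL = None  # module-level: survives across requests within one warm worker


def load_gan_model():
    # Placeholder for the expensive part (reading weights, moving to GPU, ...).
    return {"loaded": True}


def get_model():
    global _MODEL
    if _MODEL is None:           # only the first request on a worker pays this cost
        _MODEL = load_gan_model()
    return _MODEL


def handler(job):
    model = get_model()          # later requests on the same worker reuse it
    return {"status": "ok", "model_ready": model["loaded"]}
```

The same effect can be had by simply calling the loader at import time, before the handler is registered; either way, the cache only helps while the same worker process handles the requests.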
9 replies