Martin Comments - Answer Overflow

Martin

•Created by Martin on 3/18/2024 in #⚡｜serverless

How to load model into memory before the first run of a pod?

Then how do you explain that the first request to hit the worker takes much longer than the following ones, even after the worker has been down for some time? What I would expect is that when the worker boots:
- the image is loaded
- the first part of the handler runs (loading my model)

So when a request hits the worker for the first time, it should be as quick as the subsequent ones.
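For illustration, the boot sequence described above can be sketched as follows. This is a minimal sketch and not RunPod's actual behaviour: `load_model` is a hypothetical stand-in for the real model load, and the job shape `{"input": {"prompt": ...}}` is an assumption. Whether the module-level load really finishes before the first request arrives depends on when the platform starts the worker process, which is exactly what the question is asking.

```python
import time


def load_model():
    """Hypothetical stand-in for an expensive model load (assumption)."""
    time.sleep(0.2)  # simulate reading weights into memory
    return lambda text: text.upper()


# Module-level code runs once, when the worker process starts,
# not once per request.
_load_started = time.perf_counter()
MODEL = load_model()
LOAD_SECONDS = time.perf_counter() - _load_started


def handler(job):
    """Per-request work only; reuses the model loaded at import time."""
    prompt = job["input"]["prompt"]
    return {"output": MODEL(prompt)}


# In a real RunPod worker the handler would be registered with
# runpod.serverless.start({"handler": handler}); it is left out here
# so the sketch runs without the runpod package installed.

if __name__ == "__main__":
    print(f"model load took {LOAD_SECONDS:.2f}s")
    for i in range(2):
        started = time.perf_counter()
        result = handler({"input": {"prompt": "hello"}})
        elapsed = time.perf_counter() - started
        print(f"request {i}: {result} in {elapsed:.4f}s")
```

With this layout both requests in the loop pay only the per-request cost, because the load happened before either of them. If the first real request is still slow, the load is presumably happening after that request is assigned rather than ahead of it.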

11 replies

RunPod

•Created by Martin on 3/18/2024 in #⚡｜serverless

How to load model into memory before the first run of a pod?

What does FlashBoot do? Does it run this part ahead of time? Why does it not run it when my request flow is not constant?

11 replies
