Efficient way to load the model
I'm migrating my service to RunPod and I need some advice on the best way to handle a 200MB model.
Currently, I'm loading the model in the handler like this:
`
It will be used to remove text from images, so it will receive both an image and a mask and return the inpainted result.
I've never used a hosted GPU service. I was running the model on GCP without a GPU, so each request took around 10-20 seconds on the CPU. I'm hoping to get things much quicker.
Any tips are welcome, both on loading the model and on the overall use case.
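For the use case itself, here is a minimal sketch of how the request payload might be handled. It assumes the image and mask arrive as base64 strings under the hypothetical keys `image` and `mask` (the thread does not say what the payload looks like), and `run_inpainting` is only a placeholder for the real model call.

```python
import base64


def decode_job_input(event):
    """Extract raw image and mask bytes from a job event.

    Assumes base64 strings under the hypothetical keys "image" and "mask".
    """
    job_input = event.get("input", {})
    missing = [key for key in ("image", "mask") if key not in job_input]
    if missing:
        raise ValueError(f"missing input field(s): {', '.join(missing)}")
    image_bytes = base64.b64decode(job_input["image"])
    mask_bytes = base64.b64decode(job_input["mask"])
    return image_bytes, mask_bytes


def encode_result(result_bytes):
    """Wrap the inpainted image bytes in a JSON-serializable response."""
    return {"image": base64.b64encode(result_bytes).decode("ascii")}


def run_inpainting(image_bytes, mask_bytes):
    """Placeholder for the real model call; here it just echoes the image."""
    return image_bytes


def handler(event):
    """Decode the inputs, run the (placeholder) model, encode the output."""
    image_bytes, mask_bytes = decode_job_input(event)
    return encode_result(run_inpainting(image_bytes, mask_bytes))
```

Validating the input up front means a request with a missing mask fails immediately instead of part-way through inference.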
8 Replies
A 200 MB model should load fast. Just make sure you embed it in the Docker image so it does not need to be downloaded from the cloud.
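A rough sketch of what this could look like on the Python side, assuming the weights were copied into the image at build time. The path `/app/models/model.safetensors` and the `MODEL_PATH` environment variable are made up for illustration. The idea is to read from a local path and fail fast if the file is missing, rather than silently falling back to a download.

```python
import os
from pathlib import Path

# Hypothetical location where the Dockerfile copied the weights at build time.
DEFAULT_MODEL_PATH = "/app/models/model.safetensors"


def resolve_model_path(path=None):
    """Return the local weights path, failing fast if it is not in the image."""
    candidate = Path(path or os.environ.get("MODEL_PATH", DEFAULT_MODEL_PATH))
    if not candidate.is_file():
        raise FileNotFoundError(
            f"{candidate} not found; copy the weights into the image at build "
            "time instead of downloading them when the worker starts"
        )
    return candidate
```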
Great! Thanks 🔥
You should also load your model outside of your handler function, as mentioned here: https://arc.net/l/quote/jvkbeogj. That way you won't be loading your model again on every new request; it loads only once, when the worker starts. Doing this sped up my worker a lot.
Something like this:
model = Your_Model()  # loaded once, when the worker starts
runpod.serverless.start({"handler": handler})  # pass the handler function itself; it uses the global model
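A slightly fuller sketch of the same idea, with a stand-in model so it runs on its own. `DummyModel` and its `predict` method are invented for illustration, and the `image` and `mask` keys are assumptions. The `runpod.serverless.start` call is left as a comment so the snippet does not need the `runpod` package installed.

```python
class DummyModel:
    """Stand-in for a real model; counts how many times it gets loaded."""

    load_count = 0

    def __init__(self, path):
        DummyModel.load_count += 1  # the real, slow weight loading would go here
        self.path = path

    def predict(self, image, mask):
        return {"inpainted": f"{image}+{mask}"}


# Module level: runs once, when the worker process starts.
MODEL = DummyModel("model.safetensors")


def handler(event):
    """Called for every request; reuses the already-loaded MODEL."""
    job_input = event["input"]
    return MODEL.predict(job_input["image"], job_input["mask"])


# In the real worker you would finish with:
#   import runpod
#   runpod.serverless.start({"handler": handler})
```

Calling `handler` several times leaves `DummyModel.load_count` at 1, since only the module-level line constructs the model.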
@agentpietrucha
I did something like
model = Model("realvis.safetensors")
def generate(event):
    ....
runpod.serverless.start({"handler": generate})
It loads the model again and again at every request, and I also see these container logs:
2024-06-06T09:29:54Z start container
2024-06-06T09:30:37Z stop container
2024-06-06T09:30:38Z remove container
2024-06-06T09:30:38Z remove network
If this also happens, how would it keep the model loaded?
I get these logs every time, so that should mean it is loading again and again, right?
It looks like it still downloads some files.
As Papa mentioned, it looks like your worker is downloading/loading some files. @nayandhabarde your logs (start container, stop container...) are typical logs of starting a container
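One way to check whether the model really reloads on every request is to log each load and each request separately. This is only a sketch: `load_weights` is a placeholder for the real loading code, and `LOAD_EVENTS` is a made-up helper list for inspection.

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("worker")

LOAD_EVENTS = []  # one timestamp per real model load, for inspection


def load_weights():
    """Placeholder for the real (slow) model load."""
    time.sleep(0.01)
    return object()


@functools.lru_cache(maxsize=1)
def get_model():
    """Load the model on first use and reuse it for later requests."""
    started = time.perf_counter()
    model = load_weights()
    LOAD_EVENTS.append(time.time())
    log.info("model loaded in %.3fs", time.perf_counter() - started)
    return model


def handler(event):
    started = time.perf_counter()
    model = get_model()  # only the first call in this process pays the load cost
    log.info("request handled in %.3fs", time.perf_counter() - started)
    return {"model_id": id(model), "loads_so_far": len(LOAD_EVENTS)}
```

If "model loaded" shows up once per container start, the model is staying in memory and the start/stop lines just reflect the worker lifecycle. If it shows up on every request inside the same container, something in the code is reloading it. A freshly started container always has to load the model once, since nothing in memory survives a container stop.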
When I gave your screenshot a second look, it seemed to me that you may have a bug in your implementation. Share your code, or better, just the relevant fragment if possible. Only then will I or someone else be able to help you properly.
Solution
Hi, you should start your model outside of the handler