RunPod•8mo ago
Raqqa

Efficient way to load the model

I'm migrating my service to RunPod and I need some advice on the best way to handle a 200MB model. Currently, I'm loading the model in the handler like this:
model_path = "src/model.pt"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.jit.load(model_path, map_location=device)
model.eval()
model_path = "src/model.pt"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.jit.load(model_path, map_location=device)
model.eval()
It will be used to remove text from images, so it will receive both an image and a mask and return the inpainted result. I've never used a hosted GPU service; I was running the model on GCP without a GPU, so each request was taking about 10-20 seconds on the CPU. I'm hoping to get things much quicker. Any tips are welcome, both for loading the model and for the use case overall.
Solution:
Hi, you should start your model outside of the handler
8 Replies
Madiator2011•8mo ago
A 200 MB model should load fast. Just make sure you embed it into the Docker image so it doesn't need to be downloaded from cloud storage.
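For illustration, a minimal Dockerfile sketch of baking the weights into the image; the base image tag, file layout, and handler filename are assumptions, not anything RunPod requires:

# Assumed base image; use whatever CUDA/PyTorch image your worker already builds on.
FROM pytorch/pytorch:2.1.2-cuda11.8-cudnn8-runtime

WORKDIR /app

# RunPod serverless SDK plus any other dependencies.
RUN pip install --no-cache-dir runpod

# Bake the 200 MB model into the image so workers never download it at cold start.
COPY src/model.pt /app/src/model.pt

# Hypothetical entrypoint that defines the handler and calls runpod.serverless.start.
COPY handler.py /app/handler.py

CMD ["python", "-u", "handler.py"]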
RaqqaOP•8mo ago
Great! Thanks 🔥
agentpietrucha•8mo ago
You should also load your model outside of your handler function. It's mentioned here: https://arc.net/l/quote/jvkbeogj. That way you won't be loading the model again and again on every new request; it will only be loaded once, when the worker starts. Doing this sped up my worker a lot. Something like this:
model = Your_Model()
runpod.serverless.start({"handler": handler(model)})
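For illustration, a minimal sketch of that pattern, assuming a TorchScript inpainting model and base64-encoded image/mask inputs; the input field names and the pre-/post-processing steps are placeholders, not part of the original thread:

import base64
import io

import runpod
import torch
from PIL import Image

# Loaded once at import time, when the worker starts -- not once per request.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.jit.load("src/model.pt", map_location=device)
model.eval()

def handler(event):
    # RunPod serverless passes the request payload under event["input"].
    inputs = event["input"]
    image = Image.open(io.BytesIO(base64.b64decode(inputs["image"]))).convert("RGB")
    mask = Image.open(io.BytesIO(base64.b64decode(inputs["mask"]))).convert("L")

    with torch.inference_mode():
        # Placeholder: convert image/mask to tensors, call the model, encode the result.
        # result = model(image_tensor, mask_tensor)
        pass

    return {"status": "done"}

runpod.serverless.start({"handler": handler})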
nayandhabarde•7mo ago
@agentpietrucha I did something like:
model = Model("realvis.safetensors")
def generate(event): ...
runpod.serverless.start({"handler": generate})
But it still loads the model again and again on every request, and I also see:
2024-06-06T09:29:54Z start container
2024-06-06T09:30:37Z stop container
2024-06-06T09:30:38Z remove container
2024-06-06T09:30:38Z remove network
If this also happens, how would it keep the model loaded?
nayandhabarde•7mo ago
I get these logs every time, so that should mean it's loading again and again, right?
[Screenshot of the logs attached]
Madiator2011•7mo ago
It looks like it's still downloading some files.
agentpietrucha•7mo ago
As Papa mentioned, it looks like your worker is still downloading/loading some files. @nayandhabarde, your logs (start container, stop container, ...) are the typical logs of a container starting up. Taking a second look at your screenshot, it seems to me that you may have a bug in your implementation. Share your code, or better a fragment of it, if possible. Only then will I or someone else be able to help you further.
Solution
Alpay Ariyak•7mo ago
Hi, you should start your model outside of the handler