RunPod•13mo ago

Efficient way to load the model

I'm migrating my service to RunPod and I need some advice on the best way to handle a 200MB model. Currently, I'm loading the model in the handler like this:

import torch

model_path = "src/model.pt"
# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.jit.load(model_path, map_location=device)
model.eval()  # inference mode

It will be used to remove text from images, so it will receive both an image and a mask and return the inpainted result. I've never used a hosted GPU service. I was running the model on GCP without a GPU, so each request took around 10-20 seconds on the CPU, and I'm hoping to get things much quicker. Any tips are welcome, both on loading the model and on the use case overall.

Solution:

Hi, you should load your model outside of the handler

8 Replies

Madiator2011•13mo ago

A 200 MB model should load fast. Just make sure you embed it in the Docker image so it does not need to be downloaded from the cloud.
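
A minimal sketch of how a worker could fail fast when the model was not baked into the image, instead of silently falling back to a download. The helper name `require_local_model` is hypothetical; the path `src/model.pt` is the one from the question.

```python
import os


def require_local_model(path: str) -> str:
    """Return the path if the model file exists locally, otherwise raise.

    Hypothetical helper: it only checks the filesystem and never downloads.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"{path} is missing from the image; copy it in at build time"
        )
    return path


# Intended use at worker start-up (path taken from the question):
# model_path = require_local_model("src/model.pt")
```

If the check raises at start-up, the image was built without the model file, which is easier to spot than a slow first request.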

RaqqaOP•13mo ago

Great! Thanks 🔥

agentpietrucha•13mo ago

You should also load your model outside of your handler function. It is mentioned here: https://arc.net/l/quote/jvkbeogj. That way you won't be loading the model again on every new request; it is loaded only once, when the worker starts. Doing this sped up my worker a lot. Something like this:

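model = Your_Model()  # runs once, when the worker starts
# handler(model) must return the per-request function (a closure over model), not a result
runpod.serverless.start({"handler": handler(model)})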
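
A fuller sketch of the same pattern, under some assumptions: `load_model` is a hypothetical stand-in for the real load (for example the `torch.jit.load(...)` call from the question), the `"text"` input field is made up for illustration, and `LOAD_COUNT` exists only to make the "loaded once" behaviour visible. The only RunPod call used is `runpod.serverless.start({"handler": ...})`, as in the snippet above.

```python
from typing import Any, Callable, Dict

LOAD_COUNT = 0  # illustration only: counts how often the model is loaded


def load_model() -> Callable[[str], str]:
    """Hypothetical stand-in for an expensive load such as torch.jit.load(...)."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return lambda text: text.upper()  # placeholder "inference"


def make_handler(
    model: Callable[[str], str],
) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Return the per-job function; it closes over the already-loaded model."""

    def handler(job: Dict[str, Any]) -> Dict[str, Any]:
        # The request payload arrives under the "input" key;
        # the "text" field is an assumption made for this sketch.
        return {"output": model(job["input"]["text"])}

    return handler


# Module level: runs once when the worker process starts, not per request.
model = load_model()
handler = make_handler(model)


def start_worker() -> None:
    """Call this at the bottom of the worker file (requires the runpod package)."""
    import runpod

    runpod.serverless.start({"handler": handler})
```

Calling `handler` several times in the same process reuses the same `model` object, so `LOAD_COUNT` stays at 1.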

nayandhabarde•11mo ago

@agentpietrucha I did something like

model = Model("realvis.safetensors")
def generate(event): ....
runpod.serverless.start({"handler": generate})

It loads the model again and again on every request, and this also happens:

2024-06-06T09:29:54Z start container
2024-06-06T09:30:37Z stop container
2024-06-06T09:30:38Z remove container
2024-06-06T09:30:38Z remove network

So how would it keep the model loaded?

nayandhabarde•11mo ago

I get these logs every time, so that should mean it is loading again and again, right?

Madiator2011•11mo ago

It looks like it is still downloading some files.

agentpietrucha•11mo ago

As Papa mentioned, it looks like your worker is downloading/loading some files. @nayandhabarde your logs (start container, stop container...) are typical of a container starting up. When I gave your screenshot a second look, it seemed to me that you may have a bug in your implementation. Share your code, or better, a fragment of it if possible. Only then will I or someone else be able to help you better.
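
One possible way to tell the two cases apart, sketched under assumptions: module-level code runs once per Python process, so a handler can report which process served it. All names here (`WORKER_ID`, `REQUESTS_SEEN`, the returned fields) are hypothetical and not part of the RunPod API.

```python
import os
import time
import uuid
from typing import Any, Dict

# Module level: evaluated once per worker process, i.e. once per container start.
WORKER_ID = uuid.uuid4().hex[:8]
STARTED_AT = time.time()
REQUESTS_SEEN = 0


def handler(job: Dict[str, Any]) -> Dict[str, Any]:
    """Report which worker process handled the job and how many jobs it has seen."""
    global REQUESTS_SEEN
    REQUESTS_SEEN += 1
    return {
        "worker_id": WORKER_ID,
        "pid": os.getpid(),
        "requests_seen": REQUESTS_SEEN,
        "worker_age_s": round(time.time() - STARTED_AT, 3),
    }
```

If every response shows a new `worker_id` with `requests_seen` equal to 1, each request probably landed on a freshly started container, so the model load is repeating because the process is new. If `worker_id` stays the same while `requests_seen` grows and the model still reloads, the load is likely happening inside the handler.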

Solution

Alpay Ariyak•11mo ago

Hi, you should load your model outside of the handler
