Created by Raqqa on 5/1/2024 in #⚡|serverless
Efficient way to load the model
I'm migrating my service to RunPod and I need some advice on the best way to handle a 200MB model. Currently, I'm loading the model in the handler like this:
model_path = "src/model.pt"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.jit.load(model_path, map_location=device)
model.eval()
It will be used to remove text from images, so it receives both an image and a mask and returns the inpainted result. I've never used a hosted GPU service; I was running the model on GCP without a GPU, so each request took around 10-20 seconds on CPU. I'm hoping to make things much quicker. Any tips are welcome, both for loading the model and for the use case overall.
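One common pattern on serverless GPU workers is to load the model once at module import time rather than inside the handler, so warm workers reuse it across requests and only inference runs per call. A minimal sketch of that pattern; the toy TorchScript model, the `handler` signature, and the `job["input"]` field names are illustrative stand-ins, not RunPod's actual API:

```python
import io
import torch

# --- stand-in TorchScript model built in memory (replace with your src/model.pt) ---
class Inpaint(torch.nn.Module):
    def forward(self, image: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # toy "inpainting": zero out masked pixels (placeholder for the real model)
        return image * (1 - mask)

buffer = io.BytesIO()
torch.jit.save(torch.jit.script(Inpaint()), buffer)
buffer.seek(0)

# --- the part that matters: load ONCE at module import, not per request ---
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.jit.load(buffer, map_location=device)  # use "src/model.pt" in the worker
model.eval()

def handler(job):
    # warm workers reuse the module-level `model`; only inference runs per request
    image = torch.as_tensor(job["input"]["image"], dtype=torch.float32, device=device)
    mask = torch.as_tensor(job["input"]["mask"], dtype=torch.float32, device=device)
    with torch.inference_mode():
        out = model(image, mask)
    return out.cpu().tolist()
```

Since the model is only ~200 MB, baking it into the worker's Docker image keeps cold starts short; the per-request cost is then just the GPU forward pass, which should be far below the 10-20 s CPU latency.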