How to deal with multiple models?
Does anyone have a good deployment flow for serverless endpoints that serve multiple large models? Asking because building and pushing a Docker image with the model weights baked in takes forever.
1 Reply
Use network storage: keep the large model weights on a network volume and load them from the pod or serverless worker at runtime. That way the Docker image contains only your code, stays small, and never needs rebuilding when the weights change.
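A minimal sketch of that approach in Python. Assumptions: the network volume is mounted at `/runpod-volume` (RunPod's default for serverless; adjust for your platform), the weights are laid out as one directory per model, and the names `MODEL_ROOT`, `model_path`, and `load_model` are illustrative, not a real API:

```python
import os
from functools import lru_cache

# Weights live on a network volume mounted into the worker, so the
# Docker image ships only code. The mount point is an assumption —
# RunPod serverless mounts network volumes at /runpod-volume.
MODEL_ROOT = os.environ.get("MODEL_ROOT", "/runpod-volume/models")

def model_path(name: str) -> str:
    """Resolve a model name to its directory on the network volume."""
    return os.path.join(MODEL_ROOT, name)

@lru_cache(maxsize=None)
def load_model(name: str):
    # Lazy-load and cache per worker, so each model is read from
    # network storage once and reused across subsequent requests.
    from transformers import AutoModelForCausalLM  # assumed dependency
    return AutoModelForCausalLM.from_pretrained(model_path(name))
```

Because loading is lazy and cached, a single endpoint can serve several large models without baking any of them into the image; the first request for each model pays the load cost, later ones hit the in-memory cache.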