What is the expected continuous delivery (CD) setup for serverless endpoints serving private models?
Hello, our model artifacts are stored in S3. What is the continuous delivery setup for serverless models whose images are not hosted on Docker Hub?
What I have seen so far:
- Existing RunPod workers download publicly available models and push the resulting images to Docker Hub
- GitHub repo connection in the Serverless setup: I'm not sure how to pass my AWS credentials during the RunPod-managed build so it can download the model
- Network volumes that can be attached to serverless endpoints. I see Cloud Sync works RunPod -> S3 but not the other way around. How would we programmatically update this volume to refresh our model?
- No native auth integration with AWS ECR; ECR credentials expire after 12 hours, which affects reloading containers
What I have set up right now:
- A GitHub Action builds the image and pushes it to AWS ECR
- Manually update the registry credentials in RunPod
- Change the tag version for each new release (I'm okay with doing this manually for now)
3 Replies
Yeah, you can either: download the model first into your image, push it to any Docker registry you like, and use it on RunPod — you can pass the creds in the settings. Since AWS ECR has rotating creds, you can use a cron job that executes GraphQL against your RunPod account to update the registry creds (see the sketch below).
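A minimal sketch of that rotation, assuming the GraphQL endpoint is `https://api.runpod.io/graphql` authenticated via `api_key`, and assuming a mutation named `updateContainerRegistryAuth` exists for updating a stored registry credential — that mutation name and its input shape are placeholders, so verify them against RunPod's GraphQL docs. The boto3 ECR call is real:

```python
# Sketch: rotate RunPod's stored ECR credentials before the 12-hour expiry.
import base64
import os

import boto3
import requests

RUNPOD_API_KEY = os.environ["RUNPOD_API_KEY"]
REGISTRY_AUTH_ID = os.environ["RUNPOD_REGISTRY_AUTH_ID"]  # hypothetical ID of the stored credential


def fresh_ecr_password() -> str:
    """Ask ECR for a new 12-hour authorization token and extract the password."""
    token = boto3.client("ecr").get_authorization_token()["authorizationData"][0]["authorizationToken"]
    user, password = base64.b64decode(token).decode().split(":", 1)
    assert user == "AWS"  # ECR tokens always use the literal username "AWS"
    return password


def push_to_runpod(password: str) -> None:
    # NOTE: mutation name and input fields are assumptions, not confirmed API.
    mutation = """
    mutation UpdateAuth($input: UpdateContainerRegistryAuthInput!) {
      updateContainerRegistryAuth(input: $input) { id }
    }
    """
    resp = requests.post(
        "https://api.runpod.io/graphql",
        params={"api_key": RUNPOD_API_KEY},
        json={"query": mutation,
              "variables": {"input": {"id": REGISTRY_AUTH_ID,
                                      "username": "AWS",
                                      "password": password}}},
        timeout=30,
    )
    resp.raise_for_status()


if __name__ == "__main__":
    push_to_runpod(fresh_ecr_password())
```

Run this on any schedule shorter than 12 hours (e.g. every 6) so the stored credential never goes stale between pulls.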
Or use network storage to hold your models and build your own solution to update them — just re-download when something changes (see the sketch below). You can also customize your code and run it on serverless or pods.
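A sketch of that "re-download when something changes" idea, run from inside a pod or worker with the volume attached (serverless mounts network volumes at `/runpod-volume`); the bucket and key names are examples. It uses the S3 object's ETag as a cheap change marker so the model is only re-downloaded when the object actually changed:

```python
# Sketch: refresh a model on the attached network volume from S3.
import os

import boto3

BUCKET = "my-model-bucket"        # example name
KEY = "models/model.safetensors"  # example key
LOCAL = "/runpod-volume/models/model.safetensors"


def refresh_model() -> None:
    s3 = boto3.client("s3")
    head = s3.head_object(Bucket=BUCKET, Key=KEY)
    marker = LOCAL + ".etag"
    # Skip the download if the S3 object is unchanged since the last sync.
    if os.path.exists(marker) and open(marker).read() == head["ETag"]:
        return
    os.makedirs(os.path.dirname(LOCAL), exist_ok=True)
    s3.download_file(BUCKET, KEY, LOCAL)
    with open(marker, "w") as f:
        f.write(head["ETag"])
```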
for a quick win we can get you a programmatic way to update creds, along with updating the tag version (a sketch of the tag update follows)
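A sketch of what "updating the tag version programmatically" could look like: repoint the endpoint's template at a new image tag via RunPod's GraphQL API. The mutation name `saveTemplate` and its fields are assumptions here — verify against RunPod's GraphQL docs before relying on this:

```python
# Sketch: point a RunPod template at a new ECR image tag after a release.
import os

import requests


def set_image_tag(template_id: str, image: str, tag: str) -> None:
    # NOTE: mutation name and input fields are assumptions, not confirmed API.
    mutation = """
    mutation SaveTemplate($input: SaveTemplateInput!) {
      saveTemplate(input: $input) { id imageName }
    }
    """
    requests.post(
        "https://api.runpod.io/graphql",
        params={"api_key": os.environ["RUNPOD_API_KEY"]},
        json={"query": mutation,
              "variables": {"input": {"id": template_id,
                                      "imageName": f"{image}:{tag}"}}},
        timeout=30,
    ).raise_for_status()


# e.g. from the release step of a CI job (values are examples):
# set_image_tag("abc123", "123456789.dkr.ecr.us-east-1.amazonaws.com/my-model", "v1.4.0")
```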
our long term path is that we are introducing a model store, which can pull public and private models from Hugging Face and store them locally on servers with faster access than network storage; S3 support may be further down the road

So I just added a cron schedule, i.e. CloudWatch Events -> Lambda, to update credentials...
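For reference, the same credential rotation packaged as that Lambda: a handler on an EventBridge (CloudWatch Events) schedule such as `rate(6 hours)`, with `RUNPOD_API_KEY` and `RUNPOD_REGISTRY_AUTH_ID` as environment variables. The GraphQL mutation name is still a placeholder as above, and `requests` must be bundled into the deployment package since it is not in the Lambda base runtime:

```python
# Sketch: scheduled Lambda that refreshes RunPod's ECR credential.
import base64
import os

import boto3
import requests  # bundle with the deployment package


def handler(event, context):
    token = boto3.client("ecr").get_authorization_token()["authorizationData"][0]["authorizationToken"]
    _, password = base64.b64decode(token).decode().split(":", 1)
    requests.post(
        "https://api.runpod.io/graphql",
        params={"api_key": os.environ["RUNPOD_API_KEY"]},
        json={"query": """
              mutation UpdateAuth($input: UpdateContainerRegistryAuthInput!) {
                updateContainerRegistryAuth(input: $input) { id }
              }""",  # NOTE: assumed mutation name, see above
              "variables": {"input": {"id": os.environ["RUNPOD_REGISTRY_AUTH_ID"],
                                      "username": "AWS",
                                      "password": password}}},
        timeout=30,
    ).raise_for_status()
    return {"status": "rotated"}
```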
It would be nice to get good documentation on best practices for deploying private models if you are on an AWS setup.