zaid
RunPod
•Created by zaid on 2/22/2025 in #⚡|serverless
Do we get billed partially or rounded up to the second?
If my execution time is 0.35 seconds, will I be billed a full second for that request, or only for the 0.35 seconds actually used?
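To make the two possibilities concrete, here is a minimal sketch of how the billed time would differ depending on the billing granularity. The function name and the granularity values are hypothetical, chosen only for illustration; this says nothing about which scheme RunPod actually applies.

```python
def billed_ms(execution_ms: int, granularity_ms: int) -> int:
    """Round an execution time up to the next multiple of the billing granularity.

    Both arguments are whole milliseconds, which avoids floating-point rounding surprises.
    """
    if execution_ms < 0 or granularity_ms <= 0:
        raise ValueError("execution_ms must be >= 0 and granularity_ms must be > 0")
    # Ceiling division using integers only.
    units = -(-execution_ms // granularity_ms)
    return units * granularity_ms


# A 0.35 s request under two hypothetical granularities:
rounded_up = billed_ms(350, 1000)  # per-second billing: 1000 ms billed
partial = billed_ms(350, 1)        # per-millisecond billing: 350 ms billed
```

Under per-second rounding the 0.35 s request would be billed as 1000 ms, so for many short requests the difference between the two schemes adds up quickly.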
4 replies
RunPod
•Created by zaid on 2/22/2025 in #⚡|serverless
Max workers increase
Hi, we are planning a production launch and currently use a serverless setup. We see that max workers is 5 right now, and that with a balance of 100 we can increase it to 10. What is the process for increasing it to, say, 20 or 100 in the future?
2 replies
RunPod
•Created by zaid on 2/11/2025 in #⚡|serverless
What is the expected continuous delivery (CD) setup for serverless endpoints with private models?
Hello, our model artifacts are stored in S3. What is the continuous delivery setup for serverless models that are not hosted on Docker Hub?
What I have seen so far:
- Existing RunPod workers download publicly available models and push them to Docker Hub
- GitHub repo connection in the Serverless setup: I'm not sure how I would pass my AWS credentials during the RunPod-managed build to download the model
- Network volumes that can be attached to serverless. From what I can see, Cloud Sync only works from RunPod to S3, not the other way around. How would we programmatically update this volume and refresh our model?
- No native auth integration with AWS ECR; the credentials expire after 12 hours, which affects reloading containers
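For the network volume option, one way the refresh could work is for the worker itself to pull the model from S3 when the object has changed. The following is a minimal sketch under assumptions: `sync_model` is a hypothetical helper, `dest_dir` stands for wherever the volume happens to be mounted, and `s3_client` is expected to behave like a boto3 S3 client (`head_object` and `download_file`). AWS credentials would still have to reach the worker somehow, for example through endpoint environment variables.

```python
import os


def sync_model(s3_client, bucket, key, dest_dir):
    """Download s3://bucket/key into dest_dir unless the saved ETag already matches.

    Returns True if a download happened, False if the local copy was current.
    """
    os.makedirs(dest_dir, exist_ok=True)
    model_path = os.path.join(dest_dir, os.path.basename(key))
    marker_path = model_path + ".etag"  # remembers which object version is on disk

    # The ETag changes whenever the object is overwritten in S3.
    remote_etag = s3_client.head_object(Bucket=bucket, Key=key)["ETag"]

    local_etag = None
    if os.path.exists(marker_path) and os.path.exists(model_path):
        with open(marker_path) as f:
            local_etag = f.read().strip()

    if local_etag == remote_etag:
        return False

    # Download to a temporary name first so a partial file never replaces a good model.
    tmp_path = model_path + ".tmp"
    s3_client.download_file(bucket, key, tmp_path)
    os.replace(tmp_path, model_path)
    with open(marker_path, "w") as f:
        f.write(remote_etag)
    return True
```

Calling this once at worker startup, e.g. `sync_model(boto3.client("s3"), "my-bucket", "models/model.bin", "/path/to/volume/models")` with placeholder names, would mean a new release only requires uploading a new object to S3 and letting workers restart.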
What I have set up right now:
- A GitHub Action builds the image and uploads it to AWS ECR
- Credentials are updated manually
- The tag version is changed for each new release (I'm okay with doing this manually for now if I have to)
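The manual credential step could in principle be scripted and run on a schedule shorter than the 12-hour expiry. Here is a minimal sketch of the AWS half only: `fetch_ecr_credentials` and `decode_ecr_auth` are hypothetical helper names, and `ecr_client` is assumed to be a boto3 ECR client (its `get_authorization_token` call returns a base64-encoded `user:password` token). How the resulting credentials get pushed to RunPod is deliberately left out, since that depends on what RunPod's API offers.

```python
import base64


def decode_ecr_auth(auth_data):
    """Turn one entry of ECR's authorizationData into registry login fields."""
    raw = base64.b64decode(auth_data["authorizationToken"]).decode("utf-8")
    # The decoded token has the form "username:password".
    username, password = raw.split(":", 1)
    return {
        "registry": auth_data["proxyEndpoint"],
        "username": username,
        "password": password,
        "expires_at": auth_data["expiresAt"],
    }


def fetch_ecr_credentials(ecr_client):
    """Request a fresh ECR token and return it as registry login fields."""
    response = ecr_client.get_authorization_token()
    return decode_ecr_auth(response["authorizationData"][0])
```

With boto3 installed, `fetch_ecr_credentials(boto3.client("ecr"))` could run as an extra step in the existing GitHub Action, which would then pass the username and password on to wherever the registry credentials are stored.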
7 replies