Serverless Real-World Billing (Cold Start, Execution, Idle)
I understand that RunPod Serverless compute is billed as:
Cold Start Time + Execution Time + Idle Timeout
Can you help clarify how this applies in real-world settings with sporadic usage? For example:
- Docker Image: Stable Diffusion XL model
- Image Spin-Up Time: ~20 seconds
- Execution Time: 5 seconds per request
Could you explain the billing in each of the following scenarios, so I can understand how the spin-up, idle, and caching times are applied?
1. One request is sent every 10 seconds for an hour (30 minutes total execution time)
2. One request is sent every minute for an hour (5 minutes total execution time)
3. One request is sent every 10 minutes for an hour (0.5 minutes total execution time)
What would be the total serverless compute time billed in each of these cases? For example, would a full hour be charged because of a keep-alive time? Or would it be true "serverless", paying only for the compute, with caching reducing the spin-up time after the first use?
Thank you in advance.
1 Reply
You’ll need to test it to get an accurate cost. Typically, if your traffic is sporadic or has long gaps between requests, the cost per request will be higher because each request pays for more cold-start and idle time. With high, steady traffic, on the other hand, most of the cost goes to execution time.
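Before testing, a rough estimate can be made by applying the formula from the question (Cold Start Time + Execution Time + Idle Timeout) to each scenario. The sketch below is only a simplified model, not RunPod's actual billing logic. It assumes a single worker, a 20-second cold start on every start from scratch (no caching benefit), a 5-second execution time, and a hypothetical 5-second idle timeout. The function name and parameters are made up for illustration, and real numbers will depend on your endpoint settings.

```python
def estimate_billed_seconds(interval_s, duration_s=3600, cold_start_s=20,
                            exec_s=5, idle_timeout_s=5):
    """Estimate billed seconds for ONE worker receiving evenly spaced requests.

    All defaults are assumptions taken from the example in the question,
    except idle_timeout_s, which is a guessed value.
    """
    billed = 0
    free_at = None  # time the worker finishes its current job; None = worker is off

    for i in range(duration_s // interval_s):
        arrival = i * interval_s
        if free_at is None or arrival > free_at + idle_timeout_s:
            # Worker is off (never started, or its idle timeout expired).
            if free_at is not None:
                billed += idle_timeout_s  # the previous idle window ran out in full
            billed += cold_start_s        # pay the full cold start again
            start = arrival + cold_start_s
        else:
            # Worker is still warm or busy: bill any warm idle gap, then run.
            billed += max(0, arrival - free_at)
            start = max(arrival, free_at)
        billed += exec_s
        free_at = start + exec_s

    if free_at is not None:
        billed += idle_timeout_s  # idle window after the last request
    return billed


for label, interval in [("every 10 s", 10), ("every 1 min", 60), ("every 10 min", 600)]:
    print(f"{label}: {estimate_billed_seconds(interval)} s billed")
# every 10 s: 3600 s billed
# every 1 min: 1800 s billed
# every 10 min: 180 s billed
```

Under these assumptions, scenario 1 keeps the worker warm for the whole hour, so the full hour is billed, while scenarios 2 and 3 pay a cold start plus the idle timeout on every request (30 seconds billed for 5 seconds of execution). A longer idle timeout or a shorter cached spin-up time would change these totals, which is why measuring on a real endpoint is still needed.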