Riley
Start and stop multiple pods
The problem we see with serverless is that we need at least one worker always active, because the time from when we deploy to when it is ready to accept requests is around 15 minutes. That means paying 0.00026/s (0.936/hr) all the time, plus the cost of extra flex workers when volume is high, which come at 0.00044/s (about 1.58/hr). So it seems preferable to manage pods in GPU cloud ourselves, paying 0.74/hr when active and 0.006/hr when inactive.
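To make the comparison concrete, here is a rough sketch of the arithmetic in Python. The rates are the ones quoted above; the function names, the 24-hour day, and the idea that one pod replaces one serverless worker are assumptions for illustration only, not anything the provider documents.

```python
SECONDS_PER_HOUR = 3600

# Rates as quoted in the post (same currency units as there).
SERVERLESS_ACTIVE_PER_S = 0.00026  # always-on active worker
SERVERLESS_FLEX_PER_S = 0.00044    # additional flex worker
POD_ACTIVE_PER_HR = 0.74           # pod while running
POD_INACTIVE_PER_HR = 0.006        # pod while stopped


def hourly(rate_per_second):
    """Convert a per-second rate to a per-hour rate."""
    return rate_per_second * SECONDS_PER_HOUR


def serverless_daily(flex_hours=0.0):
    """Daily cost of one always-on worker plus `flex_hours` of flex-worker time.

    Assumption: the active worker is billed for the full 24 hours.
    """
    return 24 * hourly(SERVERLESS_ACTIVE_PER_S) + flex_hours * hourly(SERVERLESS_FLEX_PER_S)


def pod_daily(active_hours):
    """Daily cost of one pod that runs `active_hours` and is stopped the rest of the day."""
    return active_hours * POD_ACTIVE_PER_HR + (24 - active_hours) * POD_INACTIVE_PER_HR


if __name__ == "__main__":
    print(f"serverless active worker: {hourly(SERVERLESS_ACTIVE_PER_S):.3f}/hr")
    print(f"serverless flex worker:   {hourly(SERVERLESS_FLEX_PER_S):.3f}/hr")
    print(f"serverless, no flex:      {serverless_daily():.3f}/day")
    for hours in (8, 16, 24):  # hypothetical pod uptime per day
        print(f"pod, {hours:2d} h active:        {pod_daily(hours):.3f}/day")
```

Under these assumptions a single pod is cheaper than the always-on worker even when it runs all day, and the gap grows the more hours the pod can stay stopped. The sketch ignores the roughly 15 minute startup time, so pods would have to be started ahead of expected load.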
9 replies