Serverless capability check
I want to add RunPod into a tier of load-balanced LLM models behind an app like OpenRouter.ai, but the routing decision will happen in our own infrastructure. When I invoke a serverless instance from my app and a task completes, how am I billed for idle time if the container unloads the model from GPU memory?
In other words, I want to reduce costs and improve performance by only loading the model after an idle timeout, paying only for the small app footprint in storage/memory.
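For context, this is roughly how the routing tier would call out to RunPod. A minimal sketch, assuming the serverless /runsync API, with a hypothetical endpoint ID and payload shape (the worker's handler decides what "input" actually contains):

```python
import os
import requests

RUNPOD_API_KEY = os.environ["RUNPOD_API_KEY"]  # assumed to be set in our infra
ENDPOINT_ID = "my-llm-endpoint"                 # hypothetical endpoint ID

def invoke_runpod(prompt: str, timeout_s: int = 120) -> dict:
    """Synchronously run a job on a RunPod serverless endpoint and return its output."""
    resp = requests.post(
        f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
        headers={"Authorization": f"Bearer {RUNPOD_API_KEY}"},
        json={"input": {"prompt": prompt}},
        timeout=timeout_s,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(invoke_runpod("Hello from the router tier"))
```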
Solution
You are charged for the entire time the container is running, including cold start time, execution time, and idle timeout.
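To make that concrete, a rough per-request cost sketch under the billing rule above; the per-second price and the timings are illustrative assumptions, not RunPod's actual rates:

```python
# Billed time = cold start + execution + idle timeout (all assumed values).
PRICE_PER_SECOND = 0.00044   # assumed $/s for the chosen GPU tier
COLD_START_S = 20            # loading the model into GPU memory
EXECUTION_S = 5              # actual inference
IDLE_TIMEOUT_S = 60          # worker kept warm after the job

billed_seconds = COLD_START_S + EXECUTION_S + IDLE_TIMEOUT_S
print(f"Billed: {billed_seconds}s -> ${billed_seconds * PRICE_PER_SECOND:.4f} per request")
```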
I thought so. Do the containers have the Docker capabilities needed to create a WireGuard interface?
You can't access the underlying Docker stuff on the host machine, if that's what you're asking.
I don't mean the Docker socket. I mean I want to create a VPN tunnel to my AWS tenant rather than dealing with PKI in the container.
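If it helps to verify this empirically from inside a worker, here is a minimal sketch (assuming a Linux container with /proc mounted) that checks for the CAP_NET_ADMIN capability and the /dev/net/tun device that a WireGuard or userspace-tun setup would need. It only checks preconditions; it does not configure a tunnel:

```python
import os

def has_cap_net_admin() -> bool:
    """Parse the effective capability mask from /proc/self/status (CAP_NET_ADMIN is bit 12)."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("CapEff:"):
                cap_eff = int(line.split()[1], 16)
                return bool((cap_eff >> 12) & 1)
    return False

print("CAP_NET_ADMIN:", has_cap_net_admin())
print("/dev/net/tun present:", os.path.exists("/dev/net/tun"))
```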