Slow initialization, even with Flashboot, counted as execution time
I am running a serverless Fooocus API endpoint from this code base https://github.com/davefojtik/RunPod-Fooocus-API.
It takes a long time to initialize, even with Flashboot, and the initialization counts as execution time. In subsequent runs with Flashboot, the time is dramatically lower, until Flashboot's cache clears.
The issue is raised and discussed here https://github.com/davefojtik/RunPod-Fooocus-API/issues/5
startup speed · Issue #5 · davefojtik/RunPod-Fooocus-API
Hi @davefojtik I was trying the further optimize the speed and what I noticed is the docker image takes 40+ seconds in the first startup on a 4090 to start usually, most of your requests are passed...
It said this post needs a tag and I didn't see any tags other than Solved, but this is not solved.
Not sure why you are posting here. This is normal for a cold start. Continue the discussion in the GitHub repo, it's more suitable; it's not a RunPod issue.
@Polar can you look at fixing the tags please? We shouldn't be forced to tag new posts as closed, they should default to open and only be closed once they are actually solved.
Oh yep, that's my bad
I am cross-posting here because it's a Flashboot issue, as you can see in the conversation. Importantly, the startup time counts as execution time
It's not a Flashboot issue. You only benefit from Flashboot if you have a constant flow of requests; it's very rare to benefit from it when you don't have many requests.
The issue is that the startup time that gets reduced by Flashboot is counted as execution time
Also, how long is Flashboot cached for before it resets?
Don't load models and other resources after you call serverless.start(). Do that before calling serverless.start(); then it's counted towards cold start time and not execution time.
Endpoint configurations | RunPod Documentation
The following are configurable settings within an Endpoint.
Nobody knows, it's "magic" as per the docs.
Not sure why RunPod can't just be transparent about how it works instead of calling it "magic".
It loads before serverless.start
Then it does not count towards execution time. Only what happens after serverless.start() is counted towards execution time.
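For anyone landing here later, the advice above can be sketched roughly like this. It is a minimal illustration, not the code from the Fooocus repo: load_model(), MODEL and the returned fields are hypothetical placeholders, and the only real SDK call assumed is runpod.serverless.start() with a handler dict.

```python
import time


def load_model():
    """Hypothetical stand-in for an expensive model load."""
    time.sleep(0.05)  # simulate slow initialization
    return {"name": "placeholder-model"}


# Runs once when the worker script is imported, i.e. before
# serverless.start() is called, so per the advice in this thread
# it should fall under cold start rather than execution time.
MODEL = load_model()


def handler(job):
    """Per-request work only; reuses the already loaded MODEL."""
    prompt = job["input"]["prompt"]
    return {"model": MODEL["name"], "prompt": prompt}


def main():
    # Imported here so the sketch can be read and tested without the SDK.
    import runpod

    runpod.serverless.start({"handler": handler})

# In a real worker script, call main() as the last statement.
```

The point of the layout is that nothing slow happens inside handler(); if the load were moved into the handler, it would run as part of the request and count as execution time.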