Worker configuration for serverless

Hello. When I edit my endpoint and choose a GPU configuration, numbers are displayed next to each selection. What are they? Is it a priority order, i.e. try config 1 first, then config 2 if it errors or is unavailable, and so on? Also, my edits don't seem to be taken into account: I deselected a GPU, but the endpoint is still running on it. Is editing broken, and should I create a new endpoint instead?
4 Replies
ashleyk, 5mo ago
You have to set workers down to zero and back up again if you make changes to the priorities.
justin, 5mo ago
@PatrickR maybe something to add to the docs? I actually talked to JM about this before, and I was confused about what the numbers meant at first too. But yeah, @Pauline_Cx, they're the priorities for which GPUs get spawned, or picked when workers get switched out due to throttling or something. Editing them doesn't do anything until workers get rotated in due to throttling, or until you set max workers to 0 and then back up to 3 or so, to forcefully reset all your workers.
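A toy sketch of the behaviour described above, in Python. It is only a model of what the thread says, not RunPod's actual scheduler or API: the `Endpoint` class, `pick_gpu`, `edit_priorities`, `set_max_workers`, and the GPU names are all hypothetical and chosen for illustration. It assumes the numbers are a priority order, that running workers keep the GPU they started on, and that scaling max workers to 0 and back up replaces them.

```python
from dataclasses import dataclass, field

# Toy model only: every name here is an assumption based on the thread,
# not RunPod's real scheduler or API.


def pick_gpu(priorities, available):
    """Return the first GPU type in priority order that is available."""
    for gpu in priorities:
        if gpu in available:
            return gpu
    return None  # nothing available, so the worker cannot be placed


@dataclass
class Endpoint:
    priorities: list  # GPU types; index 0 is priority 1
    max_workers: int = 0
    workers: list = field(default_factory=list)  # GPU type per running worker

    def edit_priorities(self, priorities):
        # Only affects workers started later; running workers are untouched.
        self.priorities = list(priorities)

    def set_max_workers(self, n, available):
        self.max_workers = n
        del self.workers[n:]  # scale down: drop surplus workers
        while len(self.workers) < n:  # scale up: place workers by priority
            gpu = pick_gpu(self.priorities, available)
            if gpu is None:
                break
            self.workers.append(gpu)


available = {"A4000", "A5000", "4090"}  # example GPU types only

ep = Endpoint(priorities=["4090", "A5000"])
ep.set_max_workers(3, available)
before_edit = list(ep.workers)  # every worker landed on the priority-1 GPU

ep.edit_priorities(["A5000", "A4000"])  # deselect the 4090
after_edit = list(ep.workers)  # unchanged: the edit alone does nothing

ep.set_max_workers(0, available)  # force a reset...
ep.set_max_workers(3, available)  # ...then scale back up
after_reset = list(ep.workers)  # new workers follow the edited priorities
```

In this model `after_edit` still lists the deselected GPU, which matches the symptom in the original question, and only `after_reset` reflects the new priority list.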
PatrickR, 4mo ago
I have this documented here: https://docs.runpod.io/serverless/endpoints/manage-endpoints#edit-an-endpoint I will work on presenting that information in a better place. I'm still working on the documentation infrastructure architecture.
justin, 4mo ago
Sounds great! Yeah, I guess my feedback is more like: 1) What do the numbers mean when selecting GPUs? People aren't too sure about that, and it was something even I could only infer without being certain, because if I try to edit them nothing happens. Which leads to my second point: 2) When editing, people might not be sure from the documentation what "force a configuration update" means, since without the context from #1 they don't know that the numbers are GPU priorities for workers, or how the rotation system works. Also, does a configuration update also apply to editing env variables for templates, and so on?