RunPod 7mo ago
Harish

Increase Spot Warning Time

I see in the docs that there is a 5s window before a spot instance is interrupted. 5s isn't really enough time to save or do anything; AWS, for example, gives a 2-minute warning. Even if 2 minutes is too much, it would be huge if we could get 1m or even 30s of warning, so that we don't need to check so often.
13 Replies
digigoblin 7mo ago
AWS does not have HUGE demand for CPU instances like RunPod has for GPUs, so you can't compare. 5s is fine; the moral of the story is to not use spot instances. GPU cost on RunPod is lower than pretty much anywhere else, so there is no reason to use spot instances.
dxqbYD 7mo ago
Q: feature doesn't work, can we improve it? A: don't use feature
nerdylive 7mo ago
Check for signals in Linux, and maybe trigger a file transfer afterwards. @haris @Madiator2011 (Work) any idea how to handle this?
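As a rough illustration of "checking for signals", here is a minimal Python sketch. It assumes the interruption notice arrives as SIGTERM, which the thread does not confirm, and the `train` loop and its `on_interrupt` callback are hypothetical stand-ins for real training code.

```python
import os
import signal
import threading

# Set once the process has been asked to stop.
stop_requested = threading.Event()

def _on_sigterm(signum, frame):
    # Keep the handler tiny: only record that the signal arrived.
    stop_requested.set()

# Assumption: the spot interruption is delivered as SIGTERM.
signal.signal(signal.SIGTERM, _on_sigterm)

def train(steps, on_interrupt):
    """Hypothetical loop: run up to `steps` steps, stop early on SIGTERM."""
    done = 0
    for _ in range(steps):
        if stop_requested.is_set():
            # Hand the current progress to the caller (e.g. to save or transfer it).
            on_interrupt(done)
            break
        done += 1  # stand-in for one real training step
    return done
```

The handler only flips a flag, so the loop decides when it is safe to stop and what to save. Whatever `on_interrupt` does still has to finish inside the warning window.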
digigoblin 7mo ago
Don't know why you're tagging all these people. This is the way it works; people can either accept it or not, end of story in my opinion. There is 0% chance of spending time and effort on a feature to reduce income. Using spot instances for a GPU while in an AI bubble is the most stupid thing a person can do.
nerdylive 7mo ago
is that a question?
digigoblin 7mo ago
I don't get why people who want to use stupid features always complain about them; just don't use that feature. I think RunPod should just disable spot instances entirely. It makes no sense and causes a lot of unnecessary support requests from cheapskates.
nerdylive 7mo ago
Alright, alright. It’s important to respect that some features may be useful to others, even if we don’t find them necessary ourselves. Let’s keep our discussions considerate and polite.
digigoblin 7mo ago
That's fine, but stop wasting people's time with unnecessary support requests. Either accept the pros and cons of the feature or STOP USING IT. Simple as that. Don't whine about how it works or expect changes, because it ain't gonna happen. Don't compare RunPod to AWS either; it's not AWS.
dxqbYD 7mo ago
If this reply is in any way connected to RunPod: don't offer a feature if you don't want to receive support or feature requests about it. In 5 secs you can maybe save 500 MB to a storage volume at RunPod. It's a reasonable request to be able to save your training state in VRAM before being kicked off.
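To put numbers on that estimate: 500 MB in 5 s implies a write rate of about 100 MB/s. The sketch below is plain arithmetic under that assumption; the throughput is inferred from the comment above rather than measured, and the function names are made up for illustration.

```python
def max_saveable_mb(window_s, throughput_mb_s, overhead_s=0.0):
    """Upper bound on data (in MB) that can be written inside the warning window."""
    # Time left for writing after fixed costs (e.g. copying state out of VRAM).
    usable_s = max(window_s - overhead_s, 0.0)
    return usable_s * throughput_mb_s

def fits_in_window(checkpoint_mb, window_s, throughput_mb_s, overhead_s=0.0):
    """True if a checkpoint of `checkpoint_mb` could be written in time."""
    return checkpoint_mb <= max_saveable_mb(window_s, throughput_mb_s, overhead_s)

# Assumed rate: ~100 MB/s, i.e. the "500 MB in 5 secs" figure above.
print(max_saveable_mb(5, 100))       # 5s window
print(max_saveable_mb(30, 100))      # requested 30s window
print(fits_in_window(2000, 5, 100))  # a 2 GB checkpoint in 5s
```

At the same assumed rate, a 30s window would allow roughly six times as much state to be written as the current 5s window.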
haris 7mo ago
digi is not associated with RunPod, he is a part of our community
dxqbYD 7mo ago
ok
haris 7mo ago
I've already written down this piece of feedback and it'll be looked over by the leadership team on Tuesday. There's no guarantee that we'll change the 5s window, but we are aware internally.
Aurora 7mo ago
You can write a daemon that automatically saves work when the signal is detected.
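One possible shape for such a saver, sketched in Python. It assumes the interruption shows up as SIGTERM (not confirmed in this thread); `start_saver`, `save_atomically`, and the `get_state` callback are hypothetical names. The state is written to a temporary file and then renamed, so a kill in the middle of a save should not leave a half-written checkpoint at the final path.

```python
import os
import pickle
import signal
import tempfile
import threading

_shutdown = threading.Event()

# Assumption: the spot interruption is delivered as SIGTERM.
signal.signal(signal.SIGTERM, lambda signum, frame: _shutdown.set())

def save_atomically(state, path):
    """Write `state` to a temp file in the same directory, then rename it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp, path)     # the rename is the commit point
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise

def start_saver(get_state, path):
    """Start a background thread that saves once the shutdown signal is seen."""
    def _wait_and_save():
        _shutdown.wait()
        save_atomically(get_state(), path)
    thread = threading.Thread(target=_wait_and_save, daemon=True)
    thread.start()
    return thread
```

The main program should `join()` the returned thread before exiting, since a daemon thread is abandoned when the main thread ends. The save itself still has to complete within the warning window.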