Increase Spot Warning Time
I see in the docs that there is a 5s window before a spot instance is interrupted. 5s isn't really enough time to save state or do anything useful; for comparison, AWS gives a 2 minute warning. Even if 2 minutes is too much, it would be huge if we could get 1m or even 30s of warning, so that we don't need to checkpoint so often.
13 Replies
AWS does not have HUGE demand for CPU instances like RunPod has for GPUs so you can't compare. 5s is fine, moral of the story is to not use spot instances. GPU cost on RunPod is lower than pretty much anywhere else so there is no reason to use spot instances.
Q: feature doesn't work, can we improve it? A: don't use feature
check for signals in Linux, and maybe trigger a file transfer once you catch one
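A minimal sketch of that idea in Python, in case it helps. It assumes the interruption notice arrives as SIGTERM (check the docs for the signal your pod actually receives), and `run`, `do_step` and `save_state` are made-up names standing in for your own loop and save logic:

```python
import signal
import threading

# Assumption: the interruption notice is delivered as SIGTERM.
stop_requested = threading.Event()

def request_stop(signum, frame):
    # Keep the handler tiny: only record that a stop was requested.
    stop_requested.set()

# Must be called from the main thread of the process.
signal.signal(signal.SIGTERM, request_stop)

def run(steps, do_step, save_state):
    """Run up to `steps` iterations; save and return early once a stop is requested.

    `do_step(i)` and `save_state(i)` are hypothetical callbacks supplied by the caller.
    Returns the index of the first step that was NOT executed.
    """
    for i in range(steps):
        if stop_requested.is_set():
            save_state(i)
            return i
        do_step(i)
    return steps
```

The flag is only checked between steps, so a single step has to be much shorter than the 5s window for the save to have a chance of finishing.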
@haris
@Madiator2011 (Work) any idea how to handle this?
Don't know why you're tagging all these people. This is the way it works; people can either accept it or not, end of story in my opinion. There is 0% chance of spending time and effort on a feature to reduce income.
Using spot instances for a GPU while in an AI bubble is the most stupid thing a person can do.
is that a question?
I don't get why people who want to use stupid features always complain about it, just don't use that feature.
I think RunPod should just disable spot instances entirely, it makes no sense and causes a lot of unnecessary support by cheapskates.
Alright, alright. It's important to respect that some features may be useful to others, even if we don't find them necessary ourselves. Let's keep our discussions considerate and polite.
That's fine, but stop wasting people's time with unnecessary support requests. Either accept the pros and cons of the feature or STOP USING IT. Simple as that. Don't whine about how it works or expect changes, because it ain't gonna happen.
Don't compare RunPod to AWS either, it's not AWS.
if this reply is in any way connected to RunPod: don't offer a feature if you don't want to receive support or feature requests about it. in 5 secs you can maybe save 500 MB to a storage volume at RunPod. it's a reasonable request to be able to save your training state from VRAM before being kicked off.
digi is not associated with RunPod, he is a part of our community
ok
I've already written down this piece of feedback and it'll be looked over by the leadership team on Tuesday
There's no guarantee that we'll change the 5s window, but we are aware internally.
you can write a daemon that automatically saves work when the signal is detected
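A rough sketch of what such a saver could look like, written as a background thread rather than a full separate daemon process. Everything here is an assumption for illustration: `AutoSaver` is a made-up class, SIGTERM is assumed to be the interruption signal, and JSON stands in for whatever checkpoint format you actually use. The state has to be small enough to write inside the 5s window.

```python
import json
import os
import signal
import tempfile
import threading

class AutoSaver:
    """Hypothetical helper: writes the latest known state to disk once triggered."""

    def __init__(self, path):
        self.path = path
        self._state = {}
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._wait_and_save, daemon=True)

    def start(self):
        self._thread.start()

    def watch_signal(self, signum=signal.SIGTERM):
        # Assumption: SIGTERM is the interruption signal. Call from the main thread.
        signal.signal(signum, lambda s, f: self.trigger())

    def update(self, state):
        # Called by the main loop whenever there is new state worth keeping.
        with self._lock:
            self._state = dict(state)

    def trigger(self):
        self._stop.set()

    def join(self, timeout=None):
        self._thread.join(timeout)

    def _wait_and_save(self):
        self._stop.wait()
        with self._lock:
            snapshot = dict(self._state)
        # Write to a temp file in the same directory, then rename atomically,
        # so a kill mid-write never leaves a half-written checkpoint at `path`.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(snapshot, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)
```

Usage would be roughly: create `AutoSaver("/some/volume/ckpt.json")` (path is a placeholder), call `start()` and `watch_signal()`, then `update(...)` from the main loop. Two caveats: Python only runs signal handlers in the main thread between bytecodes, so a long blocking call can delay the trigger, and because the thread is a daemon thread the main thread should `join()` it before exiting or the write may be cut off.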