Clion
Connectivity issue on 4090 pod
Alright guys, I redid the training to reproduce the models that were on this pod, so I no longer need what is/was on this storage. But could I get a credit for the compute time I used to do so? I think this is fair to ask: users aren't told to expect that a host might make a pod with active storage inaccessible without notice, and I was paying the disk fees to keep that pod's storage active specifically so I could retrieve data from it later.
13 replies
Connectivity issue on 4090 pod
Is it possible to just say fk it and take a compute time credit? I've got tasks waiting on the trained models on that pod, and I'd guess the endgame here is going to be "all temporary storage on affected machines is lost", so if I'm gonna have to redo 8-10 hrs of compute I'd prefer to get started sooner rather than later. Naturally though, the decision is up to you guys, I'm not trying to give anyone a hard time 🙃
Connectivity issue on 4090 pod
Any updates on this? Apologies, not trying to rush you, just trying to determine whether I should spin up a new pod and redo the training I did yesterday, or whether there's actually a chance that this pod's storage can be accessed to pull the models off of it.