Created by MokshMalik on 7/29/2024 in #⛅|pods
Training jobs using script
If I just stop my pod without removing it, will I still be billed? And once I'm inside the pod, can I stop it from there? Will the command runpodctl remove pod $RUNPOD_POD_ID work from inside the pod?
32 replies
Okay, thanks!
Can you please shed some light on how to auto-kill the instance?
A big problem is auto-killing the pod once training is complete, and saving the model weights before that happens.
Well, I'm training different kinds of segmentation models for my tasks, varying from simple U-Net to Attention U-Net, and might also go for transformer-based segmentation models. I'd like to run an instance for each model, so I can compare their performance in as little time as possible.
Sorry, it is still unclear. Does RunPod have a tutorial on training a custom model on a GPU instance? I have searched for one, but I have not found any.
I'm fairly new to RunPod. Can you please point me to a tutorial where a remote training job is run on a pod, the model weights are stored on S3, and the pod automatically kills itself once the training is complete?
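For reference, the flow asked about in this thread (train, save the weights to S3, then have the pod remove itself) could be sketched roughly as below. This is only a sketch under assumptions, not an official RunPod recipe: it assumes runpodctl is on the PATH inside the pod and RUNPOD_POD_ID is set there, that boto3 is installed with AWS credentials configured, and the names train.py, weights.pt, my-bucket and the function names are all hypothetical.

```python
import os
import subprocess


def upload_to_s3(path, bucket, key):
    """Upload one file to S3. Assumes boto3 is installed and AWS
    credentials are available in the pod's environment."""
    import boto3  # imported lazily so the rest works without boto3

    boto3.client("s3").upload_file(path, bucket, key)


def train_upload_remove(train_cmd, upload, pod_id=None, run=subprocess.run):
    """Run training, save the weights, and only then remove the pod.

    train_cmd: argument list for the training script (hypothetical).
    upload:    zero-argument callable that saves the weights somewhere
               durable, e.g. a lambda wrapping upload_to_s3.
    pod_id:    defaults to the RUNPOD_POD_ID environment variable.
    run:       injectable for testing; defaults to subprocess.run.
    """
    pod_id = pod_id or os.environ.get("RUNPOD_POD_ID")
    if not pod_id:
        raise RuntimeError("RUNPOD_POD_ID is not set; is this running in a pod?")

    # check=True raises CalledProcessError if training exits non-zero,
    # so the upload and removal steps below are skipped on failure.
    run(train_cmd, check=True)

    # Save the weights BEFORE removing the pod; removal discards its disk.
    upload()

    # Same command quoted in the question above.
    run(["runpodctl", "remove", "pod", pod_id], check=True)


# Example call (hypothetical names, not executed here):
# train_upload_remove(
#     ["python", "train.py", "--model", "unet"],
#     lambda: upload_to_s3("weights.pt", "my-bucket", "unet/weights.pt"),
# )
```

With this ordering, a failed training run or a failed upload raises before the removal step, so the pod stays up (and keeps billing) but the weights on disk are not lost. Running one pod per model would mean starting this same script in each pod with a different training command.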