delirious
GPU requires reset
I restarted and re-created the pod a couple of times and got the same error on container start each time. I assume it keeps landing on the same bad node. I was able to start the container by switching to a different instance type.
2024-08-26T21:15:45Z error creating container: nvidia-smi: parsing output of line 5: failed to parse ([GPU requires reset]) into int: strconv.Atoi: parsing "": invalid syntax
Pod ID: 2hvpqmtrowunjp
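For anyone reading the error above: it looks like a status string ("[GPU requires reset]") showed up where the launcher expected a number from nvidia-smi. Here is a rough sketch of how that kind of failure can be caught and reported per line instead of aborting. This is not the platform's actual code. The function names and the sample data are made up for illustration, and I'm assuming the output is one integer per line.

```python
from typing import List, Optional, Tuple


def parse_int_field(raw: str) -> Optional[int]:
    """Return the field as an int, or None if it is not numeric."""
    try:
        return int(raw.strip())
    except ValueError:
        # Covers empty strings and status text such as "[GPU requires reset]".
        return None


def find_unparseable(lines: List[str]) -> List[Tuple[int, str]]:
    """Return (1-based line number, stripped text) for every non-numeric line."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if parse_int_field(line) is None:
            bad.append((number, line.strip()))
    return bad


# Hypothetical sample: the fifth line carries a status string instead of a number.
sample = ["0", "1", "2", "3", "[GPU requires reset]", "5"]
print(find_unparseable(sample))  # [(5, '[GPU requires reset]')]
```

With a check like this, a scheduler could in principle skip or flag the affected GPU rather than fail the container start, but whether that applies here depends on how the host actually queries nvidia-smi.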