CUDA error in community 4090x4 pod
https://github.com/BAI-Yeqi/PyTorch-Verification
Running the script from this repo on the pod gives a CUDA error.
pod id: uy54q2udx0jbcw (deleted)

3 Replies
What kind of pod, or which template, are you using?
Try a newer CUDA version, I guess.
I've noticed that on some instances with multiple RTX 4090s, CUDA doesn't work at all: even the first cudaGetDeviceCount call returns 999 (unknown error). On other instances everything works fine. Maybe the drivers are out of date or some system configuration is wrong; I haven't figured it out yet.
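If it helps anyone narrow this down, here is a rough sketch of a check you could run right after the pod starts, before launching any training. It is not from the linked repo. It uses Python's ctypes to call the CUDA driver API functions cuInit and cuDeviceGetCount directly, so it does not depend on PyTorch. The function name probe_cuda, the returned dict layout, and the default library name libcuda.so.1 (the usual Linux driver library) are my own assumptions.

```python
import ctypes

CUDA_SUCCESS = 0
CUDA_ERROR_UNKNOWN = 999  # the "unknown error" code mentioned above


def probe_cuda(lib_name="libcuda.so.1"):
    """Try to initialize the CUDA driver and count devices.

    probe_cuda and its return format are hypothetical helpers for this
    thread; only cuInit and cuDeviceGetCount are real driver API calls.
    """
    result = {"ok": False, "stage": None, "code": None, "count": 0}

    # Stage 1: can the driver library be loaded at all?
    try:
        lib = ctypes.CDLL(lib_name)
    except OSError as exc:
        result.update(stage="load", detail=str(exc))
        return result

    # Stage 2: initialize the driver (the flags argument must be 0).
    rc = lib.cuInit(0)
    if rc != CUDA_SUCCESS:
        result.update(stage="cuInit", code=rc)
        return result

    # Stage 3: ask how many devices the driver can see.
    count = ctypes.c_int(0)
    rc = lib.cuDeviceGetCount(ctypes.byref(count))
    if rc != CUDA_SUCCESS:
        result.update(stage="cuDeviceGetCount", code=rc)
        return result

    result.update(ok=True, stage="done", code=CUDA_SUCCESS, count=count.value)
    return result


if __name__ == "__main__":
    info = probe_cuda()
    print(info)
    if info["code"] == CUDA_ERROR_UNKNOWN:
        print("Driver returned 999 (unknown error); this pod is likely affected.")
```

If this reports code 999 at the cuInit stage on a fresh pod, the problem is probably on the host side (driver or system configuration) rather than in your script, which is useful to include in a support ticket along with the pod ID.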
Open a support ticket and include your pod ID.