RunPod•9mo ago
DreamGen

UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda

This is a recurring problem on RunPod. This time with a 3090 -- tried 3 different pods in the CA region (can't use the US region because it has maintenance soon...). ID: wmwxn9onlckqus
root@fd08183704a5:~# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
root@fd08183704a5:~# nvidia-smi
Sat Mar 16 07:26:26 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12 Driver Version: 525.85.12 CUDA Version: 12.1 |
|-------------------------------+----------------------+----------------------+
root@fd08183704a5:~# python
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'2.1.1+cu121'
2 Replies
DreamGen (OP)•9mo ago
I switched to a 12.3 machine and that worked in this case. In other cases it was the opposite 😄
Solution
ashleyk•9mo ago
You need to use the CUDA filter to select the correct CUDA version. Pick a machine whose CUDA version is the same as or higher than the CUDA version of your Docker image; a machine with a lower version will not work. CUDA is backwards compatible but not forwards compatible.
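To make the rule concrete, here is a minimal Python sketch of a pre-flight check. The function names (`parse_version`, `driver_cuda_from_smi`, `host_supports_image`) are made up for illustration and are not part of any RunPod or NVIDIA tooling. It only encodes the "host version must be equal or higher" rule described above, and it assumes the host version is read from the `CUDA Version:` field in the `nvidia-smi` header.

```python
import re


def parse_version(text):
    """Turn a 'major.minor' string such as '12.1' into a comparable tuple."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def driver_cuda_from_smi(smi_output):
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi header text."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


def host_supports_image(host_cuda, image_cuda):
    """True when the host CUDA version is equal to or newer than the image's."""
    return parse_version(host_cuda) >= parse_version(image_cuda)


# Header line copied from the nvidia-smi output in the original post.
sample = "| NVIDIA-SMI 525.85.12 Driver Version: 525.85.12 CUDA Version: 12.1 |"
host = driver_cuda_from_smi(sample)

# '12.1' stands in for the image's CUDA version (the cu121 PyTorch build above).
print(host, host_supports_image(host, "12.1"))
```

On a real pod you could feed in the actual `nvidia-smi` output and compare it against `torch.version.cuda`. Treat a passing check as necessary rather than sufficient: in the post above both sides report 12.1 and the error still appeared until the pod was moved to a 12.3 machine.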