Alexandre TL
ERR_NVGPUCTRPERM when profiling CUDA kernels
I'm trying to profile CUDA kernels with NCU (Nsight Compute) and I'm getting this error, apparently due to missing permissions:
"ERR_NVGPUCTRPERM - The user does not have permission to access NVIDIA GPU Performance Counters on the target device 0. For instructions on enabling permissions and to get more information see https://developer.nvidia.com/ERR_NVGPUCTRPERM"
The linked page says that when profiling kernels in containers (which is the case here with pods, right?), the container has to be launched with --cap-add=SYS_ADMIN, but I'm not sure this is possible with Runpod pods.
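To check whether the pod's container actually has that capability, one option is to read the effective capability mask from /proc/self/status. This is only a minimal sketch, assuming a Linux container and that CAP_SYS_ADMIN is capability bit 21 (its value in linux/capability.h). The function names are my own, not part of any NVIDIA or Runpod tooling, and having the capability is not a guarantee that NCU will work.

```python
# Sketch: detect CAP_SYS_ADMIN in the current process's effective capabilities.
# Assumption: Linux, where /proc/self/status exposes a "CapEff:" hex bitmask.

CAP_SYS_ADMIN_BIT = 21  # CAP_SYS_ADMIN as defined in linux/capability.h


def has_cap_sys_admin(status_text: str) -> bool:
    """Return True if the CapEff mask in the given status text has bit 21 set."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split()[1], 16)  # value is a hex string, no 0x prefix
            return bool(mask & (1 << CAP_SYS_ADMIN_BIT))
    raise ValueError("no CapEff line found")


def current_process_has_cap_sys_admin() -> bool:
    """Hypothetical helper: run the check against this process (Linux only)."""
    with open("/proc/self/status") as f:
        return has_cap_sys_admin(f.read())
```

Calling current_process_has_cap_sys_admin() inside the pod would print-test whether the container was started with the capability. If it returns False, I'd expect the ERR_NVGPUCTRPERM error above unless the host driver allows non-admin access to the counters.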
Has anyone found a workaround? Surely there is a way to profile kernels on container GPUs?
Thank you