Sean Zhang
RunPod
Created by Sean Zhang on 12/15/2024 in #⛅|pods
Enable performance counter on runpod
Hi, I'm trying to profile some CUDA kernels on a pod with an A100 in order to improve their performance. Is there a way to enable the performance counters on pods, as per https://developer.nvidia.com/nvidia-development-tools-solutions-err_nvgpuctrperm-permission-issue-performance-counters? I've tried to enable them by creating the necessary config files in /etc/modprobe.d, but to no avail. It seems that the permission needs to be enabled on the host:
When profiling within a container, access must be enabled on the host, or the container must be started with the appropriate permissions by passing --cap-add=SYS_ADMIN as an admin user.
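For reference, here is a minimal sketch of how the current restriction could be checked from inside the pod. It assumes the NVIDIA driver exposes its module parameters at /proc/driver/nvidia/params and reports the setting as `RmProfilingAdminOnly` (my understanding of the NVIDIA documentation, not something confirmed on RunPod). The helper name `profiling_admin_only` is hypothetical.

```python
from pathlib import Path

# Assumption: the driver exposes its module parameters here and the file
# is readable from inside the container.
PARAMS_PATH = Path("/proc/driver/nvidia/params")


def profiling_admin_only(params_text: str):
    """Return True if profiling is restricted to admins, False if open,
    or None if the RmProfilingAdminOnly parameter is not listed."""
    for line in params_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "RmProfilingAdminOnly":
            return value.strip() != "0"
    return None


if __name__ == "__main__":
    if PARAMS_PATH.exists():
        state = profiling_admin_only(PARAMS_PATH.read_text())
        messages = {
            True: "performance counters restricted to admin users",
            False: "performance counters available to all users",
            None: "RmProfilingAdminOnly not found in driver params",
        }
        print(messages[state])
    else:
        print(f"{PARAMS_PATH} not found")
```

If this reports the restricted state, files created under /etc/modprobe.d inside the container would presumably have no effect, since the kernel module is loaded on the host.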
Happy to provide more details; even a temporary grant of permission would be sufficient. Thanks!
5 replies