Devcontainer Template Ignores GPU Limits (All GPUs Visible)
Hey folks 👋
I'm running into an issue with GPU isolation in two different Kubernetes-based Coder templates:
Template A: Uses the Kubernetes (Deployment) template → GPU isolation works as expected. If I select 1 GPU, the container only sees 1 via nvidia-smi.
Template B: Uses the Kubernetes (Devcontainer) template → Even when I select 1 GPU, the container sees all available GPUs on the host.
Both templates configure GPU resources like this:
One key difference is that in the Devcontainer template, I had to add the following to the security_context:
I suspect this might be allowing the container to bypass Kubernetes’ GPU isolation, but I’m not sure how to safely lock it down and still allow the build process to succeed.
Has anyone dealt with this before? Is there a way to use envbuilder + GPU isolation without needing to run as root/privileged?
Any pointers would be much appreciated 🙏
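One way to check the suspicion above from inside the workspace is to compare what was allocated with what the container can actually reach. The sketch below is not from the thread. It assumes the NVIDIA device plugin passes the allocation through the NVIDIA_VISIBLE_DEVICES environment variable (its default strategy, as far as I know) and that each GPU shows up as a /dev/nvidiaN device node. The function names are made up for illustration.

```python
import os
import re


def visible_gpu_request(env=None):
    """Return the GPU ids listed in NVIDIA_VISIBLE_DEVICES.

    Returns None when the variable is unset, empty, or "all", i.e. when
    there is no explicit per-GPU allocation to compare against.
    """
    env = os.environ if env is None else env
    raw = env.get("NVIDIA_VISIBLE_DEVICES", "").strip()
    if raw in ("", "all"):
        return None
    if raw in ("none", "void"):
        return []
    return [d.strip() for d in raw.split(",") if d.strip()]


def gpu_device_nodes(dev_entries=None):
    """Return the per-GPU device nodes (nvidia0, nvidia1, ...) found in /dev.

    Control nodes such as nvidiactl or nvidia-uvm are ignored on purpose.
    """
    entries = os.listdir("/dev") if dev_entries is None else dev_entries
    return sorted(e for e in entries if re.fullmatch(r"nvidia\d+", e))


def isolation_report(env=None, dev_entries=None):
    """Summarise whether more GPU device nodes are present than were allocated."""
    requested = visible_gpu_request(env)
    nodes = gpu_device_nodes(dev_entries)
    if requested is None:
        return f"no explicit allocation found; {len(nodes)} device node(s) present"
    if len(nodes) > len(requested):
        return (
            f"possible bypass: {len(requested)} GPU(s) allocated "
            f"but {len(nodes)} device node(s) present"
        )
    return f"ok: {len(requested)} GPU(s) allocated, {len(nodes)} device node(s) present"


if __name__ == "__main__":
    print(isolation_report())
```

If this reports more device nodes than allocated GPUs in the Devcontainer workspace but not in the Deployment one, that would point at the privileged security context exposing the host's /dev rather than at the resource limits themselves.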
Category: Help needed
Product: Coder (v2)
Platform: Linux
hey, what made you need to add privileged = true and running as root?
this shouldn't be needed at all and is very insecure, i also think that's what causing your GPU isolation issue
could you provide error messages if any?
Hey @Phorcys, thanks for the reply!
I needed to add privileged = true and run_as_user = 0 because without them, the build process was failing with errors related to file access and GPU initialization. Specifically, I was getting errors like:
It seems that without privileged mode, the container couldn’t access the necessary GPU resources during the build process.
From your experience, do you think there’s a way to configure the GPU access more securely without using privileged mode? I suspect that the privileged setting is indeed causing the GPU isolation issue, but I’m not sure how to bypass the file access issues without it.
Full logs attached.
I saw the idea of using privileged = true here: https://github.com/coder/envbuilder/issues/143#issuecomment-2192405828 (Investigate GPU support · Issue #143 · coder/envbuilder)
cc @Atif
i think there's some nvidia images you can use with the drivers preinstalled, which could fix it but unsure
i'm on the go right now but I should be able to get back to you later today once i settle down
Great! Any kind of guidance here would be very much appreciated 😉
Hey folks, just to add some context, I'm using the repo https://github.com/BrunoQuaresma/envbuilder-gpu-test with the init script configured as /tmp/vectorAdd (as suggested in the GitHub issue). I'm encountering the following error during the build process:
fyi, it probably won't be today as i was out late, expect a reply this week and ping me if i forget
(we are at KubeCon EU)