EKS/ECR envbuilder layer cache
I'm trying to set up devcontainer layer caching. I started out with the aws-devcontainer starter template, and I have a repo in ECR which I have set as the "cache_repo" variable. But when I start the workspace, I see the following:
Since this is coming from terraform, it is running in the coder pod, which runs under the "coder" service account. I have a pod identity association that should be giving this service account access to ECR, with full read access and write access to the envbuilder-cache repo.
I had a hypothesis that the pod identity association was not sufficient to access ECR, only to retrieve credentials. So I adjusted the template to add a data "aws_ecr_authorization_token" and to use that to render a docker_config_base64 for the "envbuilder_cached_image":
I can see with coder state pull that it is getting an authorization token. Yet the 401 error persists.
Anything I should be checking?
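For readers following along, here is a rough sketch of the approach described above: take the ECR authorization token and render it into a Docker config that is handed to the cached image resource. This is not the poster's actual template. The resource labels ("cache", "cached") and the variable names are hypothetical, and it assumes the aws and envbuilder providers are already configured.

```hcl
# Hypothetical inputs; names are placeholders for illustration only.
variable "builder_image" { type = string }
variable "repo_url" { type = string }
variable "cache_repo" { type = string }

# Fetches a short-lived ECR token using the credentials available to the provider.
data "aws_ecr_authorization_token" "cache" {}

locals {
  # proxy_endpoint has the form "https://<account>.dkr.ecr.<region>.amazonaws.com";
  # strip the scheme so the key is the bare registry host.
  ecr_registry = trimprefix(data.aws_ecr_authorization_token.cache.proxy_endpoint, "https://")

  # authorization_token is already the base64 of "user:password",
  # which is the value Docker's config.json expects under "auth".
  docker_config_base64 = base64encode(jsonencode({
    auths = {
      (local.ecr_registry) = {
        auth = data.aws_ecr_authorization_token.cache.authorization_token
      }
    }
  }))
}

resource "envbuilder_cached_image" "cached" {
  builder_image        = var.builder_image
  git_url              = var.repo_url
  cache_repo           = var.cache_repo
  docker_config_base64 = local.docker_config_base64
}
```

The token is short-lived, so it is re-read on each plan rather than stored anywhere long term.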
Category
Help needed
Product
Coder (v2)
Platform
Linux
Logs
Please post any relevant logs/error messages.
This is the relevant portion of the template
hey, a similar issue has been reported in the past
let me find it
https://discord.com/channels/747933592273027093/1286376282984026226/1286710187033628763
ah, my bad, it seems that it's pretty much the same as your template
what IAM permissions did you set for the service account?
Sort of the same. It is building the credentials the same way. But that example is giving the credentials to envbuilder. I'm trying to give the credentials to resource "envbuilder_cached_image".
One guess I had: Maybe the terraform resource isn't using the credentials to "fetch the envbuilder binary from the builder image", but only for accessing the cache repo?
Yeah, I think that might be it: docker_config_base64 is passed into envbuilder's config, but it's not used when fetching envbuilder from the builder_image. The helper function GetRemoteImage() uses authn.DefaultKeychain, which reads from ~/.docker/config.json et al.
https://github.com/coder/terraform-provider-envbuilder/blob/main/internal/imgutil/imgutil.go#L27
https://github.com/google/go-containerregistry/blob/main/pkg/authn/keychain.go#L87
I realize I haven't said or made clear in my snippets: the builder_image is in a private repository (b/c I added some files there that we want available during devcontainer builds).
Two workarounds come to mind:
- Use the public image for builder_image in this resource--we don't actually need our modified image just to check the cache.
- Modify the coder deployment to put credentials in an appropriate place to be read by GetRemoteImage--actually, this isn't a good solution because ECR credentials expire every 12 hours; though I suppose I could complicate it further by adding a process to refresh them.
i'm not sure, @Atif do you have any ideas?
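The first workaround (a public builder_image just for the cache check) might look roughly like the sketch below. It is untested and rests on several assumptions: that ghcr.io/coder/envbuilder:latest is an acceptable public image for the probe, that the resource exposes `exists` and `image` attributes as the provider docs describe, and that the variable and local names (all hypothetical) match nothing in the real template.

```hcl
# Hypothetical inputs; names are placeholders for illustration only.
variable "private_builder_image" { type = string } # the customized image in the private ECR repo
variable "repo_url" { type = string }
variable "cache_repo" { type = string }
variable "docker_config_base64" {
  type      = string
  sensitive = true
}

resource "envbuilder_cached_image" "cached" {
  # Public image, so fetching the envbuilder binary needs no registry credentials.
  builder_image = "ghcr.io/coder/envbuilder:latest"
  git_url       = var.repo_url
  cache_repo    = var.cache_repo

  # Still needed for reading the private cache repo.
  docker_config_base64 = var.docker_config_base64
}

locals {
  # Assumption: on a cache miss the resource's image falls back to builder_image,
  # which here would be the public one, so pick the private image explicitly.
  workspace_image = (
    envbuilder_cached_image.cached.exists
    ? envbuilder_cached_image.cached.image
    : var.private_builder_image
  )
}
```

The workspace container would then use local.workspace_image, so a cache miss still builds with the customized image while the cache probe itself only ever touches the public one.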