RunPod•2w ago
Jamb

Runpod occasionally fails to pull from ECR

Every now and again I have issues starting a pod as it fails to pull from AWS ECR. Nothing in my setup changes.
error pulling image: Error response from daemon: Head "https://<AWS_ACCOUNT>.dkr.ecr.<region>.amazonaws.com/v2/<repo>/manifests/<container>": no basic auth credentials
error creating container: container: create: container create: Error response from daemon: No such image: <AWS_ACCOUNT>.dkr.ecr.<region>.amazonaws.com/<repo>:<container>
create container <aws_account>.dkr.ecr.<region>.amazonaws.com/<repo>:<container>
error creating container: container: create: container create: Error response from daemon: No such image: <aws_account>.dkr.ecr.<region>.amazonaws.com/<repo>:<container>
error pulling image: Error response from daemon: Head "https://<AWS_ACCOUNT>.dkr.ecr.<region>.amazonaws.com/v2/<repo>/manifests/<container>": no basic auth credentials
error creating container: container: create: container create: Error response from daemon: No such image: <AWS_ACCOUNT>.dkr.ecr.<region>.amazonaws.com/<repo>:<container>
create container <aws_account>.dkr.ecr.<region>.amazonaws.com/<repo>:<container>
error creating container: container: create: container create: Error response from daemon: No such image: <aws_account>.dkr.ecr.<region>.amazonaws.com/<repo>:<container>
6 Replies
JambOP•2w ago
I also have a bunch of entries in Container Registry Auth and I can't delete them.
Dj•2w ago
I'm under the impression ECR tokens entirely expire after 12 hours - this came up as a feature request for us to streamline this process last night.
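For reference, AWS documents ECR authorization tokens as valid for 12 hours, and the token is a base64 string of the form `AWS:<password>`. The sketch below shows one way to decode such a token and check how much lifetime it has left before handing it to a registry auth entry. The helper names (`decode_ecr_token`, `is_token_fresh`, `fetch_ecr_credentials`) and the 30-minute safety margin are assumptions made for illustration; only `boto3`'s `get_authorization_token` call and its response fields come from AWS. This is not how dstack or RunPod necessarily handle it.

```python
import base64
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def decode_ecr_token(authorization_token: str) -> Tuple[str, str]:
    """Split a base64 ECR authorization token into (username, password)."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    # Split on the first colon only; the password part may contain colons.
    username, _, password = decoded.partition(":")
    return username, password


def is_token_fresh(
    expires_at: datetime,
    now: Optional[datetime] = None,
    margin: timedelta = timedelta(minutes=30),  # arbitrary safety margin (assumption)
) -> bool:
    """Return True if the token still has more than `margin` left before expiry."""
    now = now or datetime.now(timezone.utc)
    return expires_at - now > margin


def fetch_ecr_credentials(region: str) -> dict:
    """Request a new token from ECR. Needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("ecr", region_name=region)
    data = client.get_authorization_token()["authorizationData"][0]
    username, password = decode_ecr_token(data["authorizationToken"])
    return {
        "username": username,
        "password": password,
        "registry": data["proxyEndpoint"],
        "expires_at": data["expiresAt"],  # timezone-aware datetime
    }
```

A caller could run `fetch_ecr_credentials` right before each pod start and skip the refresh only when `is_token_fresh` returns True for the stored `expires_at`. That rules out an expired token as the cause, but it would not help if the registry auth entry is attached to the pod too late.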
JambOP•2w ago
They do. However, with dstack, I automatically take care of this process and make sure I have a valid ECR password every time. Looking into it, it looks like dstack creates Container Registry Auth entries with the username and password and then links the pull command to the correct entry. This could be an issue on their end, as it's not cleaning up old entries. I now have 6 of them (with the docs stating I can only have a max of 4) and I can't manually delete any of them. 😦
Update: I was able to delete my entries via the API. We worked with dstack and believe the issue is due to GraphQL not having an input for registry auth on pod creation/start, so you have to edit the pod afterwards and assign the registry auth... and we think it's a weird timing issue. It looks like the REST API has this input, so they will change their automation to use REST to see if that resolves the issue.
As per this message, it looks like the containerRegistryAuthId input field for the REST API isn't working properly. We will have to wait for a fix on RunPod's side before we can test whether that's the actual issue.
Dj•2w ago
@nathaniel This one's all you ^
nathaniel•2w ago
Debugging this now, will update you when a fix is found.
JambOP•2w ago
Thank you!
