AWS EC2 (Devcontainer) fails with custom repo
Replacing the starting repo with https://github.com/KyleDavisDev-Preply/build-tools/commit/0ada6f0b3c28ad46fb570d3394d551eda34641b8 causes the coder agent to fail.
The logs freeze at step 2, "taking a snapshot...", then everything becomes unresponsive until the workspace becomes unhealthy.
Category: Help needed
Product: Coder OSS (v2)
Platform: Linux
Logs: Please post any relevant logs/error messages.
What version of Coder are you using here?
Can you also provide logs from the workspace container when the workspace is in this frozen/unhealthy state?
The devlogs above are the logs that I get.
I'm using the latest docker compose version of coder
@JustATempest are you passing https://github.com/KyleDavisDev-Preply/build-tools/commit/0ada6f0b3c28ad46fb570d3394d551eda34641b8 as an argument to the template, or just https://github.com/KyleDavisDev-Preply/build-tools?
please check the /health page to get the version
Sure... I've been fighting this for 4 hours today, so I'll be taking a break. I'll send it sometime tomorrow... 🥱
i'll also do some testing on my own to see if i can reproduce the issue, likely tomorrow :-)
7e7fbc39-bbd9-467b-ba17-53f9a8b74d53
https://github.com/KyleDavisDev-Preply/build-tools/ref/head/c_or_cpp
this is the repo url for now
it's minimal and failing
Templates I'm using for reproducing the issue.
I've done some testing and the following repo urls work on my end with another template:
- https://github.com/KyleDavisDev-Preply/build-tools
- https://github.com/KyleDavisDev-Preply/build-tools#c_or_cpp
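To make the difference between the URL forms in this thread concrete, here is a minimal Python sketch. It assumes the `#c_or_cpp` suffix carries a ref name on top of a plain clone URL, which matches the two working forms listed above. The helper names `split_repo_url` and `looks_like_web_page_url` are hypothetical, and the "more than two path segments" check is only a heuristic for spotting GitHub web-UI links such as `/commit/<sha>`. It is not how envbuilder validates its input.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def split_repo_url(repo_url: str) -> Tuple[str, Optional[str]]:
    """Split 'https://host/org/repo#ref' into (clone_url, ref).

    The ref is None when the URL has no '#fragment'.
    """
    parts = urlsplit(repo_url)
    clone_url = parts._replace(fragment="").geturl()
    ref = parts.fragment or None
    return clone_url, ref


def looks_like_web_page_url(repo_url: str) -> bool:
    """Heuristic (an assumption): an 'org/repo' clone URL has two path
    segments, so anything longer is probably a web page link."""
    segments = [s for s in urlsplit(repo_url).path.split("/") if s]
    return len(segments) > 2


if __name__ == "__main__":
    for url in (
        "https://github.com/KyleDavisDev-Preply/build-tools",
        "https://github.com/KyleDavisDev-Preply/build-tools#c_or_cpp",
        "https://github.com/KyleDavisDev-Preply/build-tools/commit/"
        "0ada6f0b3c28ad46fb570d3394d551eda34641b8",
    ):
        print(split_repo_url(url), looks_like_web_page_url(url))
```

Running this against the three URLs from the thread flags only the `/commit/...` link as a probable web page URL, which may be worth ruling out before digging into the template itself.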
i think this likely has something to do with the template but I don't really know what envbuilder is supposed to be doing on the "taking a snapshot" step, do you have any ideas on how to troubleshoot @Cian?
At the 'taking a snapshot' it's basically hashing all of the files and based on the template you shared above, pushing the images to the remote cache. Do you have container logs from this stage? Meanwhile, I'm trying out your template on my own deployment.
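As a rough illustration of why a step that hashes every file can take a while, the Python sketch below walks a directory tree and hashes its contents. This is not envbuilder's implementation. The function name `hash_tree` and the choice of SHA-256 are assumptions made for the example. The point is only that the work grows with the number and size of files, so running something like this on the instance gives a feel for how much data a snapshot has to read.

```python
import hashlib
import os
from typing import Tuple


def hash_tree(root: str) -> Tuple[str, int, int]:
    """Hash every regular file under root.

    Returns (hex_digest, file_count, total_bytes). Entries are visited
    in sorted order so the digest is stable across runs.
    """
    digest = hashlib.sha256()
    file_count = 0
    total_bytes = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # makes the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files (sockets, devices, ...).
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            digest.update(os.path.relpath(path, root).encode())
            with open(path, "rb") as f:
                # Read in 1 MiB chunks to keep memory use flat.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
                    total_bytes += len(chunk)
            file_count += 1
    return digest.hexdigest(), file_count, total_bytes


if __name__ == "__main__":
    digest, files, size = hash_tree(".")
    print(f"{files} files, {size} bytes, sha256={digest}")
```

If the file count or byte total is very large for the instance size, a long pause at this step is less surprising than a true hang.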
That's the problem. It freezes on the "taking a snapshot" step. Then I lose all connection to Coder and logging stops. I'm going to see if there's a flag I can add to pipe the logs from envbuilder into a file on the EC2 host.
I run Coder using docker compose. I'll send those logs, as well as the logs from the container running on the EC2 instance, as soon as I gather them up.
is there maybe a flag we can set to toggle debug/verbose logs in envbuilder?
ENVBUILDER_VERBOSE=true
Never mind. Ignore this if you did not see the deleted message...
ok so I think this is what we are looking for
Interesting... there may be some more detailed info in the cloud_init logs on the instance, but that's going to be annoying to get to without an SSH key in the instance.
FYI @JustATempest I updated the aws-devcontainer template: https://github.com/coder/coder/tree/4be5b2f/examples/templates/aws-devcontainer
I haven't been able to reproduce the hang you have been describing but I did add the facility to insert an SSH key into the VM so you can at least log in and poke around if things don't connect.
My experience with "taking a snapshot" is that it can sometimes take a while to complete though.
I'll see about writing up steps to reproduce