"The agent cannot authenticate until the workspace provision job has been completed."

I am trying to provision a Coder agent on a VM. I am doing this by creating a systemd unit that runs the Coder agent. However, when the unit starts, as part of a remote exec provisioner, it just keeps repeating this:
Aug 29 07:07:09 mike9-services.redacted coder[7482]: 2024-08-29 07:07:09.252 [info] connecting to coderd
Aug 29 07:07:09 mike9-services.redacted coder[7482]: 2024-08-29 07:07:09.257 [warn] run exited with error ...
Aug 29 07:07:09 mike9-services.redacted coder[7482]: error= GET https://coder.redacted/api/v2/workspaceagents/me/rpc?version=2.2: unexpected status code 401: Workspace agent not authorized.: Try logging in using >
Aug 29 07:07:09 mike9-services.redacted coder[7482]: Error: The agent cannot authenticate until the workspace provision job has been completed. If the job is no longer running, this agent is invalid.
Aug 29 07:07:09 mike9-services.redacted coder[7482]: 2024-08-29 07:07:09.252 [info] connecting to coderd
Aug 29 07:07:09 mike9-services.redacted coder[7482]: 2024-08-29 07:07:09.257 [warn] run exited with error ...
Aug 29 07:07:09 mike9-services.redacted coder[7482]: error= GET https://coder.redacted/api/v2/workspaceagents/me/rpc?version=2.2: unexpected status code 401: Workspace agent not authorized.: Try logging in using >
Aug 29 07:07:09 mike9-services.redacted coder[7482]: Error: The agent cannot authenticate until the workspace provision job has been completed. If the job is no longer running, this agent is invalid.
Restarting the systemd unit has no effect. The token is passed through from the coder_agent resource (coder_agent.main.token). With coder state pull I can see that the token on the agent resource matches the one on disk, but I cannot get this agent to associate. I've tried adding a coder_agent_instance but it doesn't help. I see an old closed issue, https://github.com/coder/coder/issues/5704, that looks similar, but there's no clear fix there.

Perhaps worth mentioning:
- Using an AWS instance, but in a different account, so using token auth, not AWS instance identity
- The AWS instance is created by a module that is intended to work independently of Coder; this seems to pose some issues for knitting the agent together with the instance, and I couldn't get coder_metadata to work either
- The actual instance is spun up fine and seems happy except for the agent connection
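For readers following along, the token wiring described in this post might look roughly like the sketch below. It is only an illustration under assumptions: the template itself was never shared in the thread, the local name `agent_env_file` is hypothetical, and how the rendered text reaches the VM (the remote-exec provisioner and the systemd unit) is left out.

```hcl
terraform {
  required_providers {
    coder = {
      source = "coder/coder"
    }
  }
}

# Agent using token auth, as described in the post (not AWS instance identity).
resource "coder_agent" "main" {
  os   = "linux"
  arch = "amd64"
  auth = "token"
}

# Hypothetical: contents of an environment file that the systemd unit could
# load, so the agent process starts with the token from the agent resource.
locals {
  agent_env_file = <<-EOT
    CODER_AGENT_TOKEN=${coder_agent.main.token}
  EOT
}
```

The error in the logs comes from the server side, so a token that matches on disk and in state does not by itself mean the agent is usable; the provision job that created it also has to have completed.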
8 Replies
Codercord · 5mo ago
Category
Bug report
Product
Coder OSS (v2)
Platform
Linux
Logs
Please post any relevant logs/error messages.
Phorcys · 5mo ago
hey @Plotly Mike -- do you still have the issue? and could you share the template you're using?
Plotly Mike (OP) · 5mo ago
I have pushed on in a bunch of different directions on this, reorganizing my TF, so I can't recreate it right now. Once it's back in working order I will see whether I've solved it myself or not. It sort of feels like it was related to my use of TF modules (the EC2 instance was in a submodule), but I'm not sure. I will reply back when I have an update. Thanks!!

I figured it out: it looks like using count on the resources associated with the agent doesn't work very well. I had been hoping to make it optional to deploy an agent or not (the Coder template I am writing is mainly about applying network infrastructure, long story!), with a VM that runs a few services as an optional add-on. I'm just making the VM mandatory, since when it's optional and enabled the agent still doesn't show up at all in the UI and the VM can't properly join it. Some improvements could definitely be made around this support, and for attaching coder_metadata in a more consistent fashion.
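The change described here can be sketched in Terraform as follows. This is an assumption-laden illustration, not the actual template: the variable name `deploy_vm` is hypothetical, and the commented-out block only reconstructs the kind of conditional `count` the post reports as problematic.

```hcl
# Hypothetical toggle; the thread does not show the real variable.
variable "deploy_vm" {
  type    = bool
  default = true
}

# Pattern reported as problematic in this thread: the agent sits behind a
# conditional count, so it would be referenced as coder_agent.main[0].
#
# resource "coder_agent" "main" {
#   count = var.deploy_vm ? 1 : 0
#   os    = "linux"
#   arch  = "amd64"
# }

# Workaround described above: declare the agent (and the VM it runs on)
# unconditionally, so it is referenced as plain coder_agent.main.
resource "coder_agent" "main" {
  os   = "linux"
  arch = "amd64"
}
```

With the unconditional form, the token reference quoted earlier in the thread (coder_agent.main.token) needs no index.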
Phorcys · 4mo ago
if you don't need the Coder part at all times, I think you should look into putting all the Coder-related stuff into a Terraform module that you source or not, but keep in mind that if the agent is optional then it won't work within Coder ;-)
Phorcys · 4mo ago
Creating Modules | Terraform | HashiCorp Developer
Modules are containers for multiple resources that are used together in a configuration. Learn when to create modules and about module structure.
Phorcys · 4mo ago
we manage threads in this #help channel just like issues/tickets, can I mark this one as resolved?
Plotly Mike (OP) · 4mo ago
Yes, I think so. I do think there are some foibles around the use of modules in Coder right now, but I don't have any outstanding clear ask. For what it's worth: yes, I do need the Coder part (as in, a tool to give my developers an ability to one-click spin up a bunch of resources) - what I don't always need is a VM/container to run the Coder Agent within.
Phorcys · 4mo ago
hmm -- I don't really think Coder would be the right thing to use if you really don't need a VM, so maybe just give them one, or use something like an Ansible playbook on an Ansible Tower/AWX server to run those
