Prevent destroying and recreating home volume on workspace update
Using this example template: https://github.com/coder/coder/tree/main/examples/templates/do-linux
When I make a change to `cloud-config.tftpl`, the volume that houses my user's home directory gets destroyed and recreated each time I update the workspace to the newest version of my template.
How can I prevent this? I'd like to be able to push a new version of the template that, say, installs some additional program, and have the user be able to update their workspace to the new version without blasting whatever they were working on in their home directory.
You can use the Terraform `ignore_changes` built-in: https://www.terraform.io/language/meta-arguments/lifecycle#ignore_changes
`ignore_changes` takes an array... what values should I be populating it with? Using the example template linked above, I can't figure out what to put in there.
When I push my template update, I see `digitalocean_volume.home_volume: Plan to create` in the output... if I understood why Terraform plans to create it, I imagine I could use that value in `ignore_changes`, right? Is there a way to get Coder/Terraform to print that information somehow?
Apologies if that's a stupid question; I'm still very new to Terraform.
All good! `ignore_changes = all` will probably do it.
Do I need to pair that with `prevent_destroy = true`, or is `ignore_changes = all` good enough on its own?
`ignore_changes = all` should be good enough on its own.
Ok, trying to push a new template with that change... will report back in a few minutes.
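For reference, the `lifecycle` block belongs inside the resource whose diffs should be ignored. A minimal sketch against a home-volume resource (the attribute values here are illustrative, not the exact template's):

```hcl
resource "digitalocean_volume" "home_volume" {
  region                   = "nyc1"        # illustrative value
  name                     = "coder-home"  # illustrative value
  size                     = 20
  initial_filesystem_type  = "ext4"
  initial_filesystem_label = "coder-home"

  lifecycle {
    # Ignore every attribute diff so a template update
    # never plans a replace for this volume.
    ignore_changes = all
  }
}
```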
Hmm, no joy. I added that inside the resource block for `"digitalocean_volume" "home_volume"`... is that the right location?
The log output when pushing the template still says `digitalocean_volume.home_volume: Plan to create`, and the volume is destroyed/recreated when updating the workspace.
This is what it looks like right now:
... or should I be putting that in `"digitalocean_project_resources" "project"` instead? Just now noticing that `digitalocean_volume.home_volume.urn` is listed as a resource dependency there...?
It's weird that it's recreated... I guess try that for the project resources, hmm.
@kyle Hey, circling back to this... I tried relocating the `lifecycle` block to the project resources, and it made no difference.
Would it be helpful to post my `main.tf` in a gist in its entirety so you can see the whole thing? I'd really like to get this figured out.
Yup, I'm happy to help!
@kyle Thanks! Here's the gist for `main.tf`: https://gist.github.com/neezer/82a0b4b74bce54d73b24b7d6c2ad71d2
Could you share the Terraform output as well?
Sure, one sec
Gist updated to include Terraform output.
@kyle ^^
Ahh ty
So on template push you'd expect a home volume to be created, right? Could you send me the output from a workspace create and stop/start please?
> So on template push you'd expect a home volume to be created, right?
No, I don't think so. I think I'd expect a new volume to be created only on workspace create.
> Could you send me the output from a workspace create and stop/start please?
Yeah, I'll add it to the gist in a bit. I'll ping ya here when it's up.
So resources are never created on template push, they are just planned.
That's my understanding, yes.
@Mr_Neezer following up on this, in the gist it doesn't show the home volume being deleted on stop.
@kyle Thanks for the follow-up! Sorry, been busy trying to figure out this automatic CSR issue... I've been testing whether or not data persists by creating a text file with some random string in my home directory, then start/stopping the instance. When I log back in using the Terminal and/or SSH, the file is gone.
I still owe you logs for start and create; once I get my recent CSR changes sorted, I'll update the gist and ping you with those too.
@kyle Ok, gist is updated with command output from create, stop, and start. `main.tf` has also been updated to show my latest setup.
Additionally, the `mounts` section in my cloud-init looks like this:
My primary test is this:
1. SSH into the workspace and run `echo "testing persistence..." > ~/test.txt`
2. Stop the workspace.
3. Start the workspace.
4. `cat ~/test.txt` -> `cat: test.txt: No such file or directory`
Additionally, it appears all my initial provision scripts are re-run each time I stop/start as well. Any ideas what I'm doing wrong here?
@kyle I see the following in the stop/start logs:
`digitalocean_volume.home_volume: Drift detected (update)`
Could that be related? What does Terraform consider "drift"?
So it's not recreating it based on the output. Are you certain it's being mounted correctly?
I didn't change anything from the `coder/coder` example template for `do-linux` as far as mounting goes (at least, not to my knowledge). Is there a way I can diagnose whether the mount happened successfully or not?
Try typing `mount` inside a workspace to see.
And toss the output in here.
Seems like the home volume isn't being mounted for some reason. Try catting the logs in `/var/log/cloud-init.log`
Yeah, looks like it. One sec, catting the cloud-init logs...
Oo:
`Stderr: mount: /home/evan: wrong fs type, bad option, bad superblock on /dev/sda, missing codepage or helper program, or other error.`
Welp, interesting
I pasted the `mounts` section of cloud-init further up this thread, if you missed that. Didn't change that from the `do-linux` example repo, though.
Hmm, it's not impossible that ours is outdated.
True.
I opened up the volume in my DO dashboard, and see this for config instructions:
Do those options in `fstab` jibe with what I'm doing in the `mounts` section of cloud-init?
I think so... but maybe instead of `auto` let's explicitly put `ext4`?
Ok, giving that a try.
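For illustration, a cloud-config `mounts` entry with the filesystem type pinned to `ext4` instead of `auto` could look like this (the label, mount point, and options are assumptions reconstructed from the errors in this thread, not the exact template values):

```yaml
#cloud-config
mounts:
  # Each entry mirrors an /etc/fstab line:
  # [ device, mount point, fs type, options, dump, pass ]
  - ["LABEL=coder-home", "/home/evan", "ext4", "defaults,discard,nofail", "0", "2"]
```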
You can try running `mount -o defaults,uid=1000,gid=1000 /dev/disk/by-id/scsi-0DO_Volume_coder-evan-dev-home /home/evan` to do it manually.
But that should essentially be the same as the cloud-init.
Interesting. That didn't work, but the error in the cloud-init logs changed:
`Stderr: mount: /home/evan: can't find LABEL=coder-home.`
That might be related to a change I tried last night, which was to use the `filesystem_label` attribute instead of the `initial_filesystem_label` attribute. Lemme put that back and try again.
While I'm waiting for that to rebuild, an adjacent question: for cloud-init, am I guaranteed to have the mounts resolved by the time I reach the `write_files` and/or `runcmd` sections? Or is the order not guaranteed?
I believe so, but I'm not certain; we'd have to double-check.
Assuming the mounts are working as expected, of course.
Stack Overflow: "cloud-init: What is the execution order of cloud-config directives?"
I know bootcmd runs early and before runcmd, but...
Seems to be accurate
Ok, rebuilt using `initial_filesystem_label` (as y'all are doing in the `do-linux` example), and I'm still getting this error:
`Stderr: mount: /home/evan: can't find LABEL=coder-home.`
Thoughts? `coder-home` seems to be the value of `digitalocean_volume.home_volume.initial_filesystem_label`.
Hmm, maybe this is related? I tried running `doctl projects resources list MY_DO_PROJECT_ID`, and I see my domain (DNS), my Kubernetes cluster, my droplet, and my hosted database... but no volumes.
Maybe the volume isn't being correctly assigned to the DO project? Would that cause issues?
The DO dashboard does show the volume as being assigned to the droplet Coder created, though.
I tried merging the config docs from above with my `mounts` block, changing `LABEL=${home_volume_label}` to `/dev/disk/by-id/scsi-0DO_Volume_${home_volume_name}` (and making the supporting changes in `main.tf`), and now I'm back to the previous error:
`Stderr: mount: /home/evan: wrong fs type, bad option, bad superblock on /dev/sda, missing codepage or helper program, or other error.`
That seems like progress, since the last error was effectively "I can't find the volume!" and now it's "This volume you gave me looks whack!" -- at least AFAICT. I'm going to try your earlier suggestion: https://discord.com/channels/747933592273027093/1028039157618069524/1030150002489688144
Seems like someone back in 2017 did roughly the same thing: see the accepted answer on https://www.digitalocean.com/community/questions/how-to-use-cloud-init-to-mount-block-storage-that-s-already-formatted-and-ready-to-mount
My droplet is Debian 11, not Ubuntu, but it's possibly a related issue.
@kyle Performing the mount manually in `runcmd` worked:
Tested through a full stop/start cycle: no errors in the cloud-init log, I see the mount in `mount` output, and files I save to my home directory now persist.
Not sure why the `mounts` module wasn't working as expected; this is the second cloud-init module that doesn't seem to work as documented (I also had issues with the `ansible` module). Not leaving a favorable impression of cloud-init, but I'm happy the issue is resolved.
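For anyone following along, a manual mount in `runcmd` can be sketched like this (the device path, mount point, and options are hypothetical, reconstructed from the commands and errors above, not the exact template):

```yaml
#cloud-config
runcmd:
  # Mount the DO block-storage volume by its stable device path,
  # then persist the mount across reboots via /etc/fstab.
  - mkdir -p /home/evan
  - mount -o defaults,discard /dev/disk/by-id/scsi-0DO_Volume_coder-evan-dev-home /home/evan
  - echo '/dev/disk/by-id/scsi-0DO_Volume_coder-evan-dev-home /home/evan ext4 defaults,nofail,discard 0 2' >> /etc/fstab
```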
Thanks again for all your help!
Absolutely wonderful!
@kyle Weeeellll, may have spoken a bit too soon. The above does work as I described, but I'm noticing now that when I stop my workspace, my old droplet is destroyed (good) but a new droplet is created (wha??)
From the very end of the logs for a "stop":
`Apply complete! Resources: 4 added, 1 changed, 4 destroyed.`
Looks like it basically just cycled the workspace instead of actually stopping it.
Ahh, ok. I had removed all the references to `count`... turns out those are needed, otherwise the droplet is still listed in the project resource list. With that added back in, the droplet doesn't get re-created when stopping.
Still not 💯 sure I understand what `count` is though; could you explain?
`count` determines whether the resource is alive when we perform a `stop`. It's kinda weird, because start/stop is a Coder-specific primitive, not something in Terraform. Essentially, we provide a helper, `data.coder_workspace.<name>.start_count`, to apply on resources; it will only be `1` when the transition is `start`. This is to allow some resources to be destroyed on `stop` for cost savings.
Thanks for the clarification, @kyle. I've a related question:
I'm trying to configure a db firewall resource (https://registry.terraform.io/providers/digitalocean/digitalocean/latest/docs/resources/database_firewall) that uses the ID of the droplet created:
This consistently gives me this error on template push:
`Error: Invalid index. The given key does not identify an element in this collection value: the collection has no elements.`
I've tried adding `depends_on = [digitalocean_droplet.workspace]`, but that doesn't seem to make a difference. This feels related to the `count` logic from above... is it?
It is! You'll need to add the `count` property to the `database_firewall` resource as well.
Do I need to change the `value` in my snippet above, or is just adding the `count` property sufficient?
Nm, I don't. Things look like they're working as expected now. Thanks @kyle!
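To sketch the `count` pattern discussed above (resource and variable names here are illustrative, not the exact template's):

```hcl
data "coder_workspace" "me" {}

# The droplet only exists while the workspace is started:
# start_count is 1 on a "start" transition and 0 on "stop",
# so Terraform destroys the droplet on stop.
resource "digitalocean_droplet" "workspace" {
  count = data.coder_workspace.me.start_count
  # ... image, region, size, user_data, etc.
}

# Anything that indexes into the droplet needs the same count,
# otherwise the index fails on stop when the collection is empty.
resource "digitalocean_database_firewall" "workspace" {
  count      = data.coder_workspace.me.start_count
  cluster_id = var.db_cluster_id # hypothetical variable

  rule {
    type  = "droplet"
    value = digitalocean_droplet.workspace[count.index].id
  }
}
```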