Coder.com•3y ago
Mr_Neezer

Prevent destroying and recreating home volume on workspace update

Using this example template: https://github.com/coder/coder/tree/main/examples/templates/do-linux When I make a change to cloud-config.tftpl, the volume that houses my user's home directory gets destroyed and recreated each time I update the workspace to the newest version of my template. How can I prevent this? I'd like to be able to push a new version of the template that, say, installs some additional program, and have the user be able to update their workspace to the new version without blasting whatever the user was working on in their home directory.
47 Replies
kyle
kyle•3y ago
You can use the Terraform ignore_changes built-in: https://www.terraform.io/language/meta-arguments/lifecycle#ignore_changes
Mr_Neezer
Mr_Neezer•3y ago
ignore_changes takes an array... what values should I be populating that with? Using the example template linked above... can't figure out what values to put in there. When I push my template update, I see digitalocean_volume.home_volume: Plan to create in the output... if I understood why Terraform plans to create it, I imagine I could use that value in ignore_changes, right? Is there a way to get Coder/Terraform to print that information somehow? Apologies if that's a stupid question; still very new to Terraform.
kyle
kyle•3y ago
All good! ignore_changes = all will probably do it
Mr_Neezer
Mr_Neezer•3y ago
Do I need to pair that with prevent_destroy = true, or is ignore_changes = all good enough on its own?
kyle
kyle•3y ago
ignore_changes = all should be good enough on its own
Mr_Neezer
Mr_Neezer•3y ago
Ok, trying to push a new template with that change... will report back in a few minutes.

Hmm, no joy. I added that inside the resource block for "digitalocean_volume" "home_volume"... is that the right location? The log output when pushing the template still says digitalocean_volume.home_volume: Plan to create, and the volume is destroyed/created when updating the workspace. This is what it looks like right now:
resource "digitalocean_volume" "home_volume" {
  region                   = var.region
  name                     = "coder-${data.coder_workspace.me.owner}-${data.coder_workspace.me.name}-home"
  size                     = var.home_volume_size
  initial_filesystem_type  = "ext4"
  initial_filesystem_label = "coder-home"

  lifecycle {
    ignore_changes = all
  }
}
... or should I be putting that in "digitalocean_project_resources" "project" instead? Just now noticing that digitalocean_volume.home_volume.urn is listed as a resource dependency there...?
kyle
kyle•3y ago
It's weird that's recreated... I guess try that for the project resources hmm
Mr_Neezer
Mr_Neezer•3y ago
@kyle Hey, circling back to this... I tried relocating the lifecycle call to project resources, and no difference. Would it be helpful to post my main.tf in a gist in its entirety so you can see the whole thing? I'd really like to get this figured out.
kyle
kyle•3y ago
Yup, I'm happy to help!
Mr_Neezer
Mr_Neezer•3y ago
@kyle Thanks! Here's the gist for main.tf: https://gist.github.com/neezer/82a0b4b74bce54d73b24b7d6c2ad71d2
kyle
kyle•3y ago
Could you share the Terraform output as well?
Mr_Neezer
Mr_Neezer•3y ago
Sure, one sec.

Gist updated to include Terraform output. @kyle ^^
kyle
kyle•3y ago
Ahh, ty. So on template push you'd expect a home volume to be created, right? Could you send me the output from a workspace create and stop/start please?
Mr_Neezer
Mr_Neezer•3y ago
> So on template push you'd expect a home volume to be created, right?
No, I don't think so. I think I'd expect a new volume to be created only on workspace create.
> Could you send me the output from a workspace create and stop/start please?
Yeah, I'll add it to the gist in a bit. I'll ping ya here when it's up.
kyle
kyle•3y ago
So resources are never created on template push, they are just planned.
Mr_Neezer
Mr_Neezer•3y ago
That's my understanding, yes.
kyle
kyle•3y ago
@Mr_Neezer following up on this, in the gist it doesn't show the home volume being deleted on stop.
Mr_Neezer
Mr_Neezer•3y ago
@kyle Thanks for the follow-up! Sorry, been busy trying to figure out this automatic CSR issue... I've been testing whether or not data persists by creating a text file with some random string in my home directory, then start/stopping the instance. When I log back in using the Terminal and/or SSH, the file is gone. I still owe you logs for start and create; once I get my recent CSR changes sorted, I'll update the gist and ping you with those too.

@kyle Ok, gist is updated with command output from create, stop, and start. main.tf has also been updated to show my latest setup. Additionally, the mounts section in my cloud-init looks like this:
mounts:
  - [
      "LABEL=${home_volume_label}",
      "/home/${username}",
      auto,
      "defaults,uid=1000,gid=1000",
    ]
My primary test is this:
1. SSH into the workspace. Run echo "testing persistence..." > ~/test.txt
2. Stop the workspace.
3. Start the workspace.
4. cat ~/test.txt -> cat: test.txt: No such file or directory

Additionally, it appears all my initial provision scripts are re-run each time I stop/start as well. Any ideas what I'm doing wrong here?

@kyle I see the following in the stop/start logs:
digitalocean_volume.home_volume: Drift detected (update)
Could that be related? What does Terraform consider "drift?"
kyle
kyle•3y ago
So it's not recreating it based on the output. Are you certain it's being mounted correctly?
Mr_Neezer
Mr_Neezer•3y ago
I didn't change anything from the coder/coder example template for do-linux as far as mounting goes (at least, not to my knowledge). Is there a way I can diagnose whether the mount happened successfully or not?
kyle
kyle•3y ago
Try typing mount inside a workspace to see. And toss the output in here
kyle
kyle•3y ago
Seems like the home volume isn't being mounted for some reason. Try catting the logs in /var/log/cloud-init.log
Mr_Neezer
Mr_Neezer•3y ago
Yeah, looks like it. One sec, catting the cloud-init logs...
Mr_Neezer
Mr_Neezer•3y ago
Oo:
Stderr: mount: /home/evan: wrong fs type, bad option, bad superblock on /dev/sda, missing codepage or helper program, or other error.
kyle
kyle•3y ago
Welp, interesting
Mr_Neezer
Mr_Neezer•3y ago
I pasted the mounts section of cloud-init further up this thread, if you missed that. Didn't change that from the do-linux example repo, though.
kyle
kyle•3y ago
Hmm, it's not impossible that ours is outdated.
Mr_Neezer
Mr_Neezer•3y ago
True. I opened up the volume in my DO dashboard, and see this for config instructions:
# Create a mount point for your volume:
$ mkdir -p /mnt/coder_evan_dev_home

# Mount your volume at the newly-created mount point:
$ mount -o discard,defaults,noatime /dev/disk/by-id/scsi-0DO_Volume_coder-evan-dev-home /mnt/coder_evan_dev_home

# Change fstab so the volume will be mounted after a reboot
$ echo '/dev/disk/by-id/scsi-0DO_Volume_coder-evan-dev-home /mnt/coder_evan_dev_home ext4 defaults,nofail,discard 0 0' | sudo tee -a /etc/fstab
Do those options in fstab jibe with what I'm doing in the mounts section of cloud-init?
kyle
kyle•3y ago
I think so... but maybe instead of auto let's explicitly put ext4?
Mr_Neezer
Mr_Neezer•3y ago
Ok, giving that a try.
kyle
kyle•3y ago
You can try running mount -o defaults,uid=1000,gid=1000 /dev/disk/by-id/scsi-0DO_Volume_coder-evan-dev-home /home/evan to do it manually. But that should essentially be the same as the cloud-init.
Mr_Neezer
Mr_Neezer•3y ago
Interesting. That didn't work, but the error in cloud-init logs changed:
Stderr: mount: /home/evan: can't find LABEL=coder-home.
That might be related to a change I tried last night, which was to use the filesystem_label attribute instead of the initial_filesystem_label attribute. Lemme put that back and try again.

While I'm waiting for that to rebuild, an adjacent question: for cloud-init, am I guaranteed to have the mounts resolved by the time I reach the write_files and/or runcmd sections? Or is the order not guaranteed?
kyle
kyle•3y ago
I believe so, but I'm not certain; we'd have to double-check.
Mr_Neezer
Mr_Neezer•3y ago
Assuming the mounts are working as expected, of course.
kyle
kyle•3y ago
Stack Overflow
cloud-init: What is the execution order of cloud-config directives?
What is the order of the directives in the cloud-config section of a cloud-init user-data object. This is important to avoid race type conditions. I know bootcmd runs early and before runcmd, but...
kyle
kyle•3y ago
Seems to be accurate
Mr_Neezer
Mr_Neezer•3y ago
Ok, rebuilt using initial_filesystem_label (as y'all are doing in the do-linux example), and I'm still getting this error:
Stderr: mount: /home/evan: can't find LABEL=coder-home.
Thoughts? Seems to be the value of digitalocean_volume.home_volume.initial_filesystem_label.

Hmm, maybe this is related? I tried running doctl projects resources list MY_DO_PROJECT_ID, and I see my domain (DNS), my kubernetes cluster, my droplet, and my hosted database... but no volumes. Maybe the volume isn't being correctly assigned to the DO project? Would that cause issues? The DO dashboard does show the volume as being assigned to the droplet Coder created, though.

I tried merging the config docs from above with my mounts block, changing LABEL=${home_volume_label} to /dev/disk/by-id/scsi-0DO_Volume_${home_volume_name} (and making the supporting changes in main.tf), and now I'm back to the previous error:
Stderr: mount: /home/evan: wrong fs type, bad option, bad superblock on /dev/sda, missing codepage or helper program, or other error.
That seems like progress, since the last error was effectively "I can't find the volume!" and now it's like "This volume you gave me looks whack!" (at least AFAICT).

I'm going to try your earlier suggestion: https://discord.com/channels/747933592273027093/1028039157618069524/1030150002489688144
Mr_Neezer
Mr_Neezer•3y ago
How to use cloud-init to mount block storage that's already formatt...
I’m using Ubuntu 16.04 on a droplet using Terraform. I have an existing volume that’s already been formatted that I would like to mount to /home so I can pe…
Mr_Neezer
Mr_Neezer•3y ago
My droplet is Debian 11, not Ubuntu, but possibly it's a related issue.

@kyle Performing the mount manually in runcmd worked:
runcmd:
  - mount -o discard,defaults,noatime /dev/disk/by-id/scsi-0DO_Volume_${home_volume_name} /home/${username}
  - echo '/dev/disk/by-id/scsi-0DO_Volume_${home_volume_name} /home/${username} ext4 defaults,uid=1000,gid=1000,nofail,discard 0 0' | tee -a /etc/fstab
Tested through a full stop/start cycle: no errors in the cloud-init log, the mount shows up in mount output, and files I save to my home directory now persist. Not sure why the mounts module wasn't working as expected; this is the second cloud-init module that doesn't seem to work as documented (I also had issues with the ansible module). Not leaving a favorable impression of cloud-init, but I'm happy the issue is resolved. Thanks again for all your help!
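
For readers following along, the "supporting changes in main.tf" mentioned earlier in the thread might look roughly like the sketch below. It assumes the template file is named cloud-config.tftpl (as in the original question) and that username is the workspace owner, which matches the /home/evan and coder-evan-dev-home names seen above. The local name user_data is hypothetical, and any other variables the template references would need to be added to the map.

```hcl
# Sketch only: pass the volume name into the cloud-config template so that
# runcmd can build /dev/disk/by-id/scsi-0DO_Volume_<name>.
locals {
  # "user_data" is a hypothetical local name, not taken from the thread.
  user_data = templatefile("${path.module}/cloud-config.tftpl", {
    # Assumption: the Linux username equals the Coder workspace owner.
    username = data.coder_workspace.me.owner

    # The name given to the digitalocean_volume resource shown earlier.
    home_volume_name = digitalocean_volume.home_volume.name

    # Add every other variable the template uses here; templatefile
    # fails if the template references a key that is missing.
  })
}
```

The droplet would then reference local.user_data in its user_data argument.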
kyle
kyle•3y ago
Absolutely wonderful!
Mr_Neezer
Mr_Neezer•3y ago
@kyle Weeeellll, may have spoken a bit too soon. The above does work as I described, but I'm noticing now that when I stop my workspace, my old droplet is destroyed (good) but a new droplet is created (wha??) From the very end of the logs for a "stop":
Apply complete! Resources: 4 added, 1 changed, 4 destroyed.
Looks like it basically just cycled the workspace instead of actually stopping it.

Ahh, ok. I had removed all the references to count... turns out those are needed, otherwise the droplet is still listed in the project resource list. Adding that back in and the droplet doesn't get re-created when stopping. Still not 💯 sure I understand what count is though; could you explain?
kyle
kyle•3y ago
count determines whether the resource exists after we perform a stop. It's kinda weird, because start/stop is a Coder-specific primitive, not something in Terraform. Essentially, we provide a helper, data.coder_workspace.<name>.start_count, that is 1 only when the transition is start (and 0 on stop). Setting it as count on a resource means that resource is destroyed on stop, which allows for cost savings.
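
A minimal sketch of that pattern, assuming the resource names from the do-linux example: the home volume has no count and so survives a stop, while the droplet uses start_count and is destroyed on stop. The variables var.droplet_image and var.droplet_size are assumptions for illustration, and provider and variable declarations are left to the surrounding template.

```hcl
data "coder_workspace" "me" {}

# Persistent: no count, so this volume is kept when the workspace stops.
resource "digitalocean_volume" "home_volume" {
  region                   = var.region
  name                     = "coder-${data.coder_workspace.me.owner}-${data.coder_workspace.me.name}-home"
  size                     = var.home_volume_size
  initial_filesystem_type  = "ext4"
  initial_filesystem_label = "coder-home"
}

# Ephemeral: start_count is 1 on start and 0 on stop, so Terraform
# destroys the droplet on stop and creates a fresh one on start.
resource "digitalocean_droplet" "workspace" {
  count      = data.coder_workspace.me.start_count
  region     = var.region
  name       = "coder-${data.coder_workspace.me.owner}-${data.coder_workspace.me.name}"
  image      = var.droplet_image # assumed variable
  size       = var.droplet_size  # assumed variable
  volume_ids = [digitalocean_volume.home_volume.id]
}
```

Because the droplet is recreated on every start, cloud-init runs again each time, which would explain the provisioning scripts re-running after a stop/start.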
Mr_Neezer
Mr_Neezer•3y ago
Thanks for the clarification, @kyle . I've a related question: I'm trying to configure a db firewall resource (https://registry.terraform.io/providers/digitalocean/digitalocean/latest/docs/resources/database_firewall) that uses the ID of the droplet created:
rule {
  type  = "droplet"
  value = digitalocean_droplet.workspace[0].id
}
This consistently gives me this error on template push:
Error: Invalid index The given key does not identify an element in this collection value: the collection has no elements.
I've tried adding depends_on = [digitalocean_droplet.workspace] but that doesn't seem to make a difference. This feels related to the count logic from above... is it?
kyle
kyle•3y ago
It is! You'll need to add the count property to the database_firewall resource as well.
Mr_Neezer
Mr_Neezer•3y ago
Do I need to change the value in my snippet above, or is just adding the count property sufficient?

Nm, I don't. Things look like they're working as expected now. Thanks @kyle !
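
Putting the last two messages together, the firewall resource might look something like the sketch below. The resource label "workspace" and the variable var.database_cluster_id are hypothetical; the rule block is the one from the thread, unchanged.

```hcl
resource "digitalocean_database_firewall" "workspace" {
  # Same count as the droplet: when the workspace is stopped there are no
  # instances of this resource, so the [0] index below is never evaluated.
  count = data.coder_workspace.me.start_count

  cluster_id = var.database_cluster_id # hypothetical variable

  rule {
    type  = "droplet"
    value = digitalocean_droplet.workspace[0].id
  }
}
```

Without count, the rule is evaluated even when the droplet list is empty, which is what produces the "Invalid index" error on template push.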