Coder.com•3y ago

Prevent destroying and recreating home volume on workspace update

Using this example template: https://github.com/coder/coder/tree/main/examples/templates/do-linux

When I make a change to cloud-config.tftpl, the volume that houses my user's home directory gets destroyed and recreated each time I update the workspace to the newest version of my template. How can I prevent this? I'd like to be able to push a new version of the template that, say, installs some additional program, and have the user be able to update their workspace to the new version without blasting whatever the user was working on in their home directory.

47 Replies

kyle•3y ago

You can use the Terraform ignore_changes built-in: https://www.terraform.io/language/meta-arguments/lifecycle#ignore_changes

Mr_NeezerOP•3y ago

ignore_changes takes an array... what values should I be populating that with? Using the example template linked above... can't figure out what values to put in there. When I push my template update, I see digitalocean_volume.home_volume: Plan to create in the output... if I understood why Terraform plans to create it, I imagine I could use that value in ignore_changes, right? Is there a way to get Coder/Terraform to print that information somehow? Apologies if that's a stupid question; still very new to Terraform.

kyle•3y ago

All good! ignore_changes = all will probably do it

Mr_NeezerOP•3y ago

Do I need to pair that with prevent_destroy = true, or is ignore_changes = all good enough on its own?

kyle•3y ago

ignore_changes = all should be good enough on its own
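For reference, a minimal sketch of where the two lifecycle arguments discussed here would sit. The resource mirrors the home volume from the example template, and the var.* and data.coder_workspace.me references are assumed to be declared elsewhere in the template. ignore_changes = all tells Terraform to disregard attribute differences on an already-created resource; prevent_destroy makes Terraform reject any plan that would destroy it, which would presumably also get in the way of deleting the workspace.

```hcl
resource "digitalocean_volume" "home_volume" {
  region                   = var.region
  name                     = "coder-${data.coder_workspace.me.owner}-${data.coder_workspace.me.name}-home"
  size                     = var.home_volume_size
  initial_filesystem_type  = "ext4"
  initial_filesystem_label = "coder-home"

  lifecycle {
    # Ignore attribute differences on the existing volume instead of
    # planning an update or replacement for it.
    ignore_changes = all

    # Optional and stricter: fail the plan if it would destroy the volume.
    # Left commented out because it would also reject a full teardown.
    # prevent_destroy = true
  }
}
```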

Mr_NeezerOP•3y ago

Ok, trying to push a new template with that change... will report back in a few minutes.

Hmm, no joy. I added that inside the resource block for "digitalocean_volume" "home_volume"... is that the right location? The log output when pushing the template still says digitalocean_volume.home_volume: Plan to create and the volume is destroyed/created when updating the workspace. This is what it looks like right now:

resource "digitalocean_volume" "home_volume" {
  region                   = var.region
  name                     = "coder-${data.coder_workspace.me.owner}-${data.coder_workspace.me.name}-home"
  size                     = var.home_volume_size
  initial_filesystem_type  = "ext4"
  initial_filesystem_label = "coder-home"

  lifecycle {
    ignore_changes = all
  }
}

... or should I be putting that in "digitalocean_project_resources" "project" instead? Just now noticing that digitalocean_volume.home_volume.urn is listed as a resource dependency there...?

kyle•3y ago

It's weird that's recreated... I guess try that for the project resources hmm

Mr_NeezerOP•3y ago

@kyle Hey, circling back to this... I tried relocating the lifecycle call to project resources, and no difference. Would it be helpful to post my main.tf in a gist in its entirety so you can see the whole thing? I'd really like to get this figured out.

kyle•3y ago

Yup, I'm happy to help!

Mr_NeezerOP•3y ago

@kyle Thanks! Here's the gist for main.tf: https://gist.github.com/neezer/82a0b4b74bce54d73b24b7d6c2ad71d2

kyle•3y ago

Could you share the Terraform output as well?

Mr_NeezerOP•3y ago

Sure, one sec.

Gist updated to include Terraform output. @kyle ^^

kyle•3y ago

Ahh, ty.

So on template push you'd expect a home volume to be created, right? Could you send me the output from a workspace create and stop/start please?

Mr_NeezerOP•3y ago

> So on template push you'd expect a home volume to be created, right?

No, I don't think so. I think I'd expect a new volume to be created only on workspace create.

> Could you send me the output from a workspace create and stop/start please?

Yeah, I'll add it to the gist in a bit. I'll ping ya here when it's up.

kyle•3y ago

So resources are never created on template push, they are just planned.

Mr_NeezerOP•3y ago

That's my understanding, yes.

kyle•3y ago

@Mr_Neezer following up on this, in the gist it doesn't show the home volume being deleted on stop.

Mr_NeezerOP•3y ago

@kyle Thanks for the follow-up! Sorry, been busy trying to figure out this automatic CSR issue... I've been testing whether or not data persists by creating a text file with some random string in my home directory, then stopping and starting the instance. When I log back in using the Terminal and/or SSH, the file is gone. I still owe you logs for start and create; once I get my recent CSR changes sorted, I'll update the gist and ping you with those too.

@kyle Ok, gist is updated with command output from create, stop, and start. main.tf has also been updated to show my latest setup. Additionally, the mounts section in my cloud-init looks like this:

mounts:
  - [
      "LABEL=${home_volume_label}",
      "/home/${username}",
      auto,
      "defaults,uid=1000,gid=1000",
    ]

My primary test is this:

1. SSH into the workspace. Run echo "testing persistence..." > ~/test.txt
2. Stop the workspace.
3. Start the workspace.
4. cat ~/test.txt -> cat: test.txt: No such file or directory

Additionally, it appears all my initial provision scripts are re-run each time I stop/start as well. Any ideas what I'm doing wrong here?

@kyle I see the following in the stop/start logs:

digitalocean_volume.home_volume: Drift detected (update)

Could that be related? What does Terraform consider "drift"?

kyle•3y ago

So it's not recreating it based on the output. Are you certain it's being mounted correctly?

Mr_NeezerOP•3y ago

I didn't change anything from the coder/coder example template for do-linux as far as mounting goes (at least, not to my knowledge). Is there a way I can diagnose whether the mount happened successfully or not?

kyle•3y ago

Try typing mount inside a workspace to see. And toss the output in here

Mr_NeezerOP•3y ago

message.txt

kyle•3y ago

Seems like the home volume isn't being mounted for some reason. Try catting the logs in /var/log/cloud-init.log

Mr_NeezerOP•3y ago

Yeah, looks like it. One sec, catting the cloud-init logs...

Mr_NeezerOP•3y ago

cloud-init.log

Mr_NeezerOP•3y ago

Oo:

Stderr: mount: /home/evan: wrong fs type, bad option, bad superblock on /dev/sda, missing codepage or helper program, or other error.

kyle•3y ago

Welp, interesting

Mr_NeezerOP•3y ago

I pasted the mounts section of cloud-init further up this thread, if you missed that. Didn't change that from the do-linux example repo, though.

kyle•3y ago

Hmm, it's not impossible that ours is outdated.

Mr_NeezerOP•3y ago

True. I opened up the volume in my DO dashboard, and see this for config instructions:

# Create a mount point for your volume:
$ mkdir -p /mnt/coder_evan_dev_home

# Mount your volume at the newly-created mount point:
$ mount -o discard,defaults,noatime /dev/disk/by-id/scsi-0DO_Volume_coder-evan-dev-home /mnt/coder_evan_dev_home

# Change fstab so the volume will be mounted after a reboot
$ echo '/dev/disk/by-id/scsi-0DO_Volume_coder-evan-dev-home /mnt/coder_evan_dev_home ext4 defaults,nofail,discard 0 0' | sudo tee -a /etc/fstab

Do those options in fstab jibe with what I'm doing in the mounts section of cloud-init?

kyle•3y ago

I think so... but maybe instead of auto let's explicitly put ext4?

Mr_NeezerOP•3y ago

Ok, giving that a try.

kyle•3y ago

You can try running mount -o defaults,uid=1000,gid=1000 /dev/disk/by-id/scsi-0DO_Volume_coder-evan-dev-home /home/evan to do it manually. But that should essentially be the same as the cloud-init.

Mr_NeezerOP•3y ago

Interesting. That didn't work, but the error in cloud-init logs changed:

Stderr: mount: /home/evan: can't find LABEL=coder-home.

That might be related to a change I tried last night, which was to try to use the filesystem_label attribute instead of the initial_filesystem_label attribute. Lemme put that back and try again.

While I'm waiting for that to rebuild, an adjacent question: for cloud-init, am I guaranteed to have the mounts resolved by the time I reach the write_files and/or runcmd sections? Or is the order not guaranteed?

kyle•3y ago

I believe so, but I'm not certain; we'd have to double-check.

Mr_NeezerOP•3y ago

Assuming the mounts are working as expected, of course.

kyle•3y ago

https://stackoverflow.com/questions/34095839/cloud-init-what-is-the-execution-order-of-cloud-config-directives

kyle•3y ago

Seems to be accurate

Mr_NeezerOP•3y ago

Ok, rebuilt using initial_filesystem_label (as y'all are doing in the do-linux example), and I'm still getting this error:

Stderr: mount: /home/evan: can't find LABEL=coder-home.

Thoughts? Seems to be the value of digitalocean_volume.home_volume.initial_filesystem_label.

Hmm, maybe this is related? I tried running doctl projects resources list MY_DO_PROJECT_ID, and I see my domain (DNS), my kubernetes cluster, my droplet, and my hosted database... but no volumes. Maybe the volume isn't being correctly assigned to the DO project? Would that cause issues? The DO dashboard does show the volume as being assigned to the droplet Coder created, though.

I tried merging the config docs from above with my mounts block, changing LABEL=${home_volume_label} to /dev/disk/by-id/scsi-0DO_Volume_${home_volume_name} (and making the supporting changes in main.tf), and now I'm back to the previous error:

Stderr: mount: /home/evan: wrong fs type, bad option, bad superblock on /dev/sda, missing codepage or helper program, or other error.

That seems like progress, since the last error was effectively "I can't find the volume!" and now it's like "This volume you gave me looks whack!" (at least AFAICT).

I'm going to try your earlier suggestion: https://discord.com/channels/747933592273027093/1028039157618069524/1030150002489688144

Mr_NeezerOP•3y ago

Seems like someone back in 2017 did roughly the same thing: see the accepted answer on https://www.digitalocean.com/community/questions/how-to-use-cloud-init-to-mount-block-storage-that-s-already-formatted-and-ready-to-mount

Mr_NeezerOP•3y ago

My droplet is Debian 11, not Ubuntu, but it's possibly a related issue.

@kyle Performing the mount manually in runcmd worked:

runcmd:
  - mount -o discard,defaults,noatime /dev/disk/by-id/scsi-0DO_Volume_${home_volume_name} /home/${username}
  - echo '/dev/disk/by-id/scsi-0DO_Volume_${home_volume_name} /home/${username} ext4 defaults,uid=1000,gid=1000,nofail,discard 0 0' | tee -a /etc/fstab

Tested through a full stop/start cycle: no errors in the cloud-init log, the mount shows up in the mount output, and files I save to my home directory now persist. Not sure why the mounts module wasn't working as expected; this is the second cloud-init module that doesn't seem to work as documented (I also had issues with the ansible module). Not leaving a favorable impression of cloud-init, but I'm happy the issue is resolved. Thanks again for all your help!
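The "supporting changes in main.tf" mentioned earlier aren't shown in the thread. One way they might look, as a sketch only: the template file name comes from the original question, the map keys match the ${username} and ${home_volume_name} placeholders in the runcmd lines, and the local name cloud_config is hypothetical. A real template would pass more values (the agent's init script, for example); those are left out here.

```hcl
locals {
  # Render cloud-config.tftpl, passing the volume *name* so the template can
  # build the /dev/disk/by-id/scsi-0DO_Volume_<name> device path.
  cloud_config = templatefile("${path.module}/cloud-config.tftpl", {
    username         = data.coder_workspace.me.owner
    home_volume_name = digitalocean_volume.home_volume.name
  })
}
```

The droplet would then consume it with user_data = local.cloud_config.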

kyle•3y ago

Absolutely wonderful!

Mr_NeezerOP•3y ago

@kyle Weeeellll, may have spoken a bit too soon. The above does work as I described, but I'm noticing now that when I stop my workspace, my old droplet is destroyed (good) but a new droplet is created (wha??). From the very end of the logs for a "stop":

Apply complete! Resources: 4 added, 1 changed, 4 destroyed.

Looks like it basically just cycled the workspace instead of actually stopping it.

Ahh, ok. I had removed all the references to count... turns out those are needed, otherwise the droplet is still listed in the project resource list. Adding that back in and the droplet doesn't get re-created when stopping. Still not 💯 sure I understand what count is though; could you explain?

kyle•3y ago

count determines whether the resource is alive when we perform a stop. It's kinda weird, because start/stop is a Coder-specific primitive, not something in Terraform. Essentially, we provide a helper, data.coder_workspace.<name>.start_count, to use as the count on resources; it is 1 when the transition is start and 0 when it is stop. This is to allow some resources to destroy on stop for cost savings.
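A rough sketch of that pattern, using the resource names from this thread. The image slug, droplet size, and var.project_uuid are placeholders, and the var.* values are assumed to be declared elsewhere in the template. The volume has no count, so it survives a stop; the droplet takes its count from start_count, so it is destroyed on stop and recreated on start.

```hcl
data "coder_workspace" "me" {}

# No count: the home volume stays in place across stop and start.
resource "digitalocean_volume" "home_volume" {
  region                   = var.region
  name                     = "coder-${data.coder_workspace.me.owner}-${data.coder_workspace.me.name}-home"
  size                     = var.home_volume_size
  initial_filesystem_type  = "ext4"
  initial_filesystem_label = "coder-home"
}

# count follows the workspace transition: 1 on start, 0 on stop.
resource "digitalocean_droplet" "workspace" {
  count      = data.coder_workspace.me.start_count
  region     = var.region
  name       = "coder-${data.coder_workspace.me.owner}-${data.coder_workspace.me.name}"
  image      = "debian-11-x64" # placeholder image slug
  size       = "s-1vcpu-1gb"   # placeholder droplet size
  volume_ids = [digitalocean_volume.home_volume.id]
}

resource "digitalocean_project_resources" "project" {
  project = var.project_uuid # hypothetical variable

  # The splat yields an empty list when the droplet's count is 0, so no
  # droplet is referenced while the workspace is stopped.
  resources = concat(
    [digitalocean_volume.home_volume.urn],
    digitalocean_droplet.workspace[*].urn,
  )
}
```

The splat in the project resource list is one way to avoid indexing [0] on a droplet that doesn't exist while stopped; a conditional on length(digitalocean_droplet.workspace) would be another.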

Mr_NeezerOP•3y ago

Thanks for the clarification, @kyle . I've a related question: I'm trying to configure a db firewall resource (https://registry.terraform.io/providers/digitalocean/digitalocean/latest/docs/resources/database_firewall) that uses the ID of the droplet created:

  rule {
    type  = "droplet"
    value = digitalocean_droplet.workspace[0].id
  }

This consistently gives me this error on template push:

Error: Invalid index
The given key does not identify an element in this collection value: the collection has no elements.

I've tried adding depends_on = [digitalocean_droplet.workspace] but that doesn't seem to make a difference. This feels related to the count logic from above... is it?

kyle•3y ago

It is! You'll need to add the count property to the database_firewall resource as well.

Mr_NeezerOP•3y ago

Do I need to change the value in my snippet above, or is just adding the count property sufficient?

Nm, I don't. Things look like they're working as expected now. Thanks @kyle !
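For anyone landing here later, a sketch of what that fix might look like. The resource name workspace and var.database_cluster_id are hypothetical; the rule block is the one from the snippet above. Giving the firewall the same count as the droplet means it only exists when digitalocean_droplet.workspace[0] does, so the index is valid whenever the rule is evaluated.

```hcl
resource "digitalocean_database_firewall" "workspace" {
  # Same count as the droplet: present on start, absent on stop.
  count      = data.coder_workspace.me.start_count
  cluster_id = var.database_cluster_id # hypothetical variable

  rule {
    type  = "droplet"
    value = digitalocean_droplet.workspace[0].id
  }
}
```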
