Managing files over WebDAV with Terraform

A few weeks ago, I participated in a conversation on social media about using simple technologies like WebDAV instead of relying on modern and unreliable stuff like MinIO.

WebDAV is boring and it is not gonna change its license or become obsolete because some shareholders decided they want more, so everybody else is gonna have less. WebDAV is an Internet Standard implemented by Apache, nginx, Caddy and many others.

Unfortunately, modern DevOps tools prefer to play with modern tech. It is easy to publish a blob on an S3-like platform with infrastructure-as-code tools like Terraform, but you are out of luck if you want to push that same blob to a WebDAV server. Or at least, you WERE out of luck, because I decided to follow through and develop a WebDAV Terraform provider in my free time! Hooray!

OpenTofu is not supported at the moment because I used features that are not yet available in a stable OpenTofu release. As soon as v1.11 is out, I'll work on compatibility.

Some of you may be wondering about these missing features. In short, OpenTofu < 1.11 lacks ephemeral resources and write-only attributes. Those are very important features that enable secure-ish[1] secret handling in Terraform/OpenTofu. And because developing a Terraform provider with write-only attributes is a challenge, I decided to document the journey in this blog post.

But first, let’s talk about my use cases.

What Documents Could I Possibly Publish over WebDAV?

WebDAV's main use case nowadays is CalDAV. Most people using an online calendar are probably synchronizing their events over WebDAV without ever knowing it.

In the infrastructure that I have developed, I like to use WebDAV because it relies on mTLS for authentication. Transferring files over SSH/SFTP is fine as long as you are able to authenticate the SSH server. Unfortunately, this is rarely done. Most Terraform providers I audited silently skip host key verification, allowing free meddler-in-the-middle (MITM) attacks and making the connection about as secure as Telnet… This is why I developed the SSH2VSOCK Terraform provider, which leverages the implicit trust you have in the hypervisor to access a guest VM over SSH without needing to verify the SSH host key. Unfortunately, this provider only works when using a (compatible) hypervisor. When using a bare-metal server or an incompatible hypervisor, SSH host key verification remains a challenge.

Meanwhile, secure TLS certificate deployment to authenticate servers (and clients) is trivial thanks to the ACME protocol. Once the certificates are provisioned, tools like Terraform can safely connect to a WebDAV server over HTTPS, authenticate using their client certificate, and manage files over a secure channel.

Using this secure channel, I can push configuration files, like Ignition configurations that are later served over HTTPS to booting Fedora CoreOS instances. I can also push shared secrets to several instances and avoid using centralized online secret managers, which carry their own risks. This is a topic I discussed at length in a podcast episode (in French, but the transcript is available if you want to run it through a translator).

Secret Handling in Terraform

Handling secrets in Terraform used to be a very bad practice. Terraform is a state machine, and a state machine is based on… you guessed it: a state. And that state is stored in clear text[2]. OpenTofu implemented client-side encryption using a passphrase, but the idea is cursed because it means anybody knowing the password can learn all the secrets stored in the state. If you want to fragment the knowledge, you have to fragment the workspace as well.

HashiCorp (employees and community members) came up with a really smart solution to this issue: ephemeral resources and write-only attributes.

Ephemeral resources have a different lifecycle from standard resources. They are created (opened) with every execution (plan or apply) and destroyed (closed) immediately after the execution. Thus, they are never persisted in the state. Write-only attributes can be assigned ephemeral values generated by ephemeral resources, and these attributes are NOT stored in the state either. Putting all of this together means you can generate a password with an ephemeral resource and assign it to a write-only attribute, without it ever being stored in the state.

As such, Terraform can become a secret generator and distribution center without ever becoming a single point of failure.

Developing a Terraform resource with some write-only attributes

Write-only attributes are weird.

When developing a resource, one implements the usual CRUD operations: Create, Read, Update, and Delete. Which of these functions is called depends on the plan. The called function receives the planned values as arguments and then proceeds to make the intended changes. It is pretty straightforward.

With write-only attributes, things change: the plan only ever contains null values for the write-only attributes. This is because write-only attributes can receive ephemeral values, which can be different with every invocation of Terraform. Building a plan containing values for these attributes would not make sense, since those values might differ during the apply phase.

To implement CRUD operations on resources containing write-only attributes, you need to get values both from the plan and from the configuration.
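
Concretely, with the Terraform plugin framework, this means reading the write-only attribute from the configuration instead of the plan. Here is a minimal sketch of what a Create function can look like, assuming an illustrative FileResource type and a model matching the attributes used in the examples further down (this is a sketch, not the provider's exact code):

import (
	"context"

	"github.com/hashicorp/terraform-plugin-framework/path"
	"github.com/hashicorp/terraform-plugin-framework/resource"
	"github.com/hashicorp/terraform-plugin-framework/types"
)

// Illustrative resource type; the real one would hold a configured WebDAV client.
type FileResource struct{}

// fileResourceModel mirrors the resource schema.
type fileResourceModel struct {
	Path          types.String `tfsdk:"path"`
	InlineContent types.String `tfsdk:"inline_content"` // write-only
	HashSalt      types.String `tfsdk:"hash_salt"`
}

func (r *FileResource) Create(ctx context.Context, req resource.CreateRequest, resp *resource.CreateResponse) {
	// Regular attributes come from the plan, as usual.
	var plan fileResourceModel
	resp.Diagnostics.Append(req.Plan.Get(ctx, &plan)...)

	// The write-only attribute is always null in the plan; its actual
	// value must be read from the configuration.
	var content types.String
	resp.Diagnostics.Append(req.Config.GetAttribute(ctx, path.Root("inline_content"), &content)...)
	if resp.Diagnostics.HasError() {
		return
	}

	// ... PUT content.ValueString() to the WebDAV server here ...

	// Only regular attributes end up in the state; the write-only
	// attribute stays null there, by design.
	resp.Diagnostics.Append(resp.State.Set(ctx, &plan)...)
}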

Now the problem is that your resources won't show any changes during the planning phase if the only thing that changed is the value assigned to a write-only attribute. The reason is that your plan is computed by comparing the state and the planned changes. Except your planned changes will contain a null value for your write-only attributes, and so will the state for these write-only attributes… So since nothing has changed, there is nothing to do. Thank you very much.

Don’t even think about adding a standard plan modifier like RequireReplace(): the value did not change! It is still null, so there is nothing to replace!

The Terraform documentation contains some “best practices” for implementing write-only attributes. The main advice is to pair up your write-only attributes with some standard read-write attributes (like a version number or a keeper), or to use the resource's private state.

Keepers and version numbers are exposed to the practitioners, which effectively means that we transfer to them the burden of deciding when to replace the resource.

So I first attempted to rely exclusively on the private state, and it somewhat worked… until it did not.

Using the private state

First off, let's clarify that the private state is not private in the sense that it is confidential. The private state is still part of the state that is sent in clear text to the state storage backend. Private only means that the content is not spilled in front of the practitioners' eyes and that they cannot modify it directly. As such, storing the configuration value of a write-only attribute in the private state effectively negates the very thing that makes write-only attributes usable as a security feature. If you are a security auditor, you should check that the developers did not do that…

The private state can, however, be used to store a value derived from the write-only value: for instance, a cryptographic digest of the value assigned to the write-only attribute. When a new value is assigned, a custom plan modification function can hash the new value and compare it with the digest stored in the private state. If the digests differ, the value changed and the resource needs to be updated or replaced.
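
Here is a hedged sketch of such a plan modification function, implemented as a resource-level ModifyPlan on the illustrative FileResource from above, using a plain SHA-512 digest (the private state key name and the error handling are mine; the framework requires private state values to be valid JSON, hence the json.Marshal):

import (
	"bytes"
	"context"
	"crypto/sha512"
	"encoding/hex"
	"encoding/json"

	"github.com/hashicorp/terraform-plugin-framework/path"
	"github.com/hashicorp/terraform-plugin-framework/resource"
	"github.com/hashicorp/terraform-plugin-framework/types"
)

const digestKey = "content_digest" // illustrative private state key

// ModifyPlan recomputes the digest of the configured write-only value and
// compares it with the digest stored in the private state by Create/Update.
func (r *FileResource) ModifyPlan(ctx context.Context, req resource.ModifyPlanRequest, resp *resource.ModifyPlanResponse) {
	// Nothing to compare when the resource is being created or destroyed.
	if req.State.Raw.IsNull() || req.Plan.Raw.IsNull() {
		return
	}

	var content types.String
	resp.Diagnostics.Append(req.Config.GetAttribute(ctx, path.Root("inline_content"), &content)...)
	if resp.Diagnostics.HasError() || content.IsNull() {
		return
	}

	sum := sha512.Sum512([]byte(content.ValueString()))
	// Private state values must be valid JSON.
	digestJSON, err := json.Marshal(hex.EncodeToString(sum[:]))
	if err != nil {
		resp.Diagnostics.AddError("digest encoding failed", err.Error())
		return
	}

	stored, diags := req.Private.GetKey(ctx, digestKey)
	resp.Diagnostics.Append(diags...)
	if resp.Diagnostics.HasError() {
		return
	}

	if !bytes.Equal(stored, digestJSON) {
		// The write-only value changed: force a replacement.
		resp.RequiresReplace = append(resp.RequiresReplace, path.Root("inline_content"))
	}
}

Create and Update then persist the fresh digest with resp.Private.SetKey(ctx, digestKey, digestJSON), so the next plan has something to compare against.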

This may be acceptable, considering a cryptographic hash function is a one-way operation and learning the content of the private state is not supposed to yield information on the hashed data.

And if you thought I was making any sense: gotcha. Yes, a cryptographic hash function is a one-way operation, but it is a deterministic one. As such, if the same value is assigned to two resource instances, one won't be able to learn the assigned value just by looking at the state, but one will be able to tell that the same value was assigned to both resource instances. That may be an undesired property. Also, if the assigned value is a low-to-medium-entropy password, one could try to brute-force the password value from the digest.

As a consequence, there is no universal answer here: depending on the content that may be assigned to the write-only attribute, you may want to hash the value with a “simple” cryptographic hash function, like SHA-512, or with a function built to derive keys from passwords, like Argon2id.

In the case of the WebDAV provider I developed, I chose to expose to the practitioners a write-only attribute for the content of the managed files. Doing so keeps the state from being bloated with large file contents. It also prevents sensitive values from being stored in the state when the file contains stuff like passwords.

To detect changes, I store a digest in the private state, using the custom plan modification function described above. The practitioners can tell the provider that the content of the file contains sensitive values by specifying a hash salt. If no salt is specified, the stored digest is computed with SHA-512. If a salt is specified, a SHA-512 digest is computed over the file content; it is then hex-encoded and used as the input of the Argon2id hash function along with the salt value. Using this “weird” combination of SHA-512 + Argon2id is just a way to avoid having to pass the whole file content to the Argon2id function. This should not affect security, even if you believe hash shucking gives you any advantage.
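
A sketch of that digest computation, with illustrative Argon2id parameters (not necessarily the ones the provider actually uses):

import (
	"crypto/sha512"
	"encoding/hex"

	"golang.org/x/crypto/argon2"
)

// contentDigest hashes the file content for change detection. Without a
// salt, a plain SHA-512 is used. With a salt, the hex-encoded SHA-512
// digest is fed to Argon2id, so the expensive function never processes
// the whole file content.
func contentDigest(content, salt []byte) string {
	sum := sha512.Sum512(content)
	if len(salt) == 0 {
		return hex.EncodeToString(sum[:])
	}
	pre := []byte(hex.EncodeToString(sum[:]))
	// Illustrative Argon2id parameters: 1 pass, 64 MiB of memory,
	// 4 threads, 32-byte output.
	key := argon2.IDKey(pre, salt, 1, 64*1024, 4, 32)
	return hex.EncodeToString(key)
}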

As such, practitioners using my WebDAV provider can write stuff like:

terraform {
  required_providers {
    random = {
       source = "hashicorp/random"
    }
    remotefs = {
      source = "X-Cli/remotefs"
    }
  }
}

variable "mysecret" {
  type = string
  ephemeral = true
}

provider "remotefs" {
  webdav = {
    base_url = "YOUR_WEBDAV_SERVER_URL"
  }
}

resource "random_pet" "my-id" {
  length = 4
}

# Creates a /my-id file containing the random petname that was generated above
resource "remotefs_file" "davsecret" {
  path = "/my-id"
  inline_content = random_pet.my-id.id
}

resource "random_bytes" "salt" {
  length = 16
}

# Creates a /secret.txt file containing the secret specified as an ephemeral variable
resource "remotefs_file" "davsecret" {
  path = "/secret.txt"
  inline_content = var.mysecret
  hash_salt = random_bytes.salt.hex 
}

This ought to work… as long as you provide the same ephemeral value during the plan and the apply phases.

But as soon as you try to do the following, the provider will explode in flight:

terraform {
  required_providers {
    random = {
       source = "hashicorp/random"
    }
    remotefs = {
      source = "X-Cli/remotefs"
    }
  }
}

provider "remotefs" {
  webdav = {
    base_url = "YOUR_WEBDAV_SERVER_URL"
  }
}

ephemeral "random_password" "mysecret" {
  length = 16
}

resource "remotefs_file" "mysecret" {
  path = "/mysecret"
  inline_content = ephemeral.random_password.mysecret.result
}

This is because the ephemeral random_password resource will generate a different value during the plan phase, the apply phase, and the post-apply non-refresh plan phase that ensures the apply went as expected. So the solution presented in this section only works for values that are not generated by an ephemeral resource. That may be OK for you, but I would love to be able to generate random passwords like in the previous example. So we need another mechanism!

Using keepers

Keepers are a notion introduced by the hashicorp/random provider. They consist of a map of values that have no ties to the business logic of the managed resource. Instead, the keepers control the lifecycle of the managed resource: if the values in the keeper map change, the resource must be replaced.

Using keeper values, one can ignore the digest difference between the private state and the assigned value, and instead rely entirely on keeper changes.

To support both workflows, I modified my custom plan modification function to skip the digest computation and comparison when the keeper attribute is not null, and I added a keeper attribute with a plan modifier that requires resource replacement when it is updated. During the replacement, whatever value the ephemeral random_password resource yields during the apply phase will be used.
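
Here is a sketch of how this can be wired up in the schema and in the plan modification function (again on the illustrative FileResource; the framework's mapplanmodifier.RequiresReplace() handles the replacement when the keeper map changes):

import (
	"context"

	"github.com/hashicorp/terraform-plugin-framework/path"
	"github.com/hashicorp/terraform-plugin-framework/resource"
	"github.com/hashicorp/terraform-plugin-framework/resource/schema"
	"github.com/hashicorp/terraform-plugin-framework/resource/schema/mapplanmodifier"
	"github.com/hashicorp/terraform-plugin-framework/resource/schema/planmodifier"
	"github.com/hashicorp/terraform-plugin-framework/types"
)

// Schema excerpt: keepers is an ordinary read-write map whose changes
// force a replacement of the resource.
func (r *FileResource) Schema(ctx context.Context, req resource.SchemaRequest, resp *resource.SchemaResponse) {
	resp.Schema = schema.Schema{
		Attributes: map[string]schema.Attribute{
			"path": schema.StringAttribute{Required: true},
			"inline_content": schema.StringAttribute{
				Optional:  true,
				WriteOnly: true, // the plan and the state only ever hold null here
			},
			"keepers": schema.MapAttribute{
				ElementType: types.StringType,
				Optional:    true,
				PlanModifiers: []planmodifier.Map{
					mapplanmodifier.RequiresReplace(),
				},
			},
		},
	}
}

// Revised ModifyPlan: when keepers are set, they drive the lifecycle and
// the digest comparison from the previous section is skipped entirely.
func (r *FileResource) ModifyPlan(ctx context.Context, req resource.ModifyPlanRequest, resp *resource.ModifyPlanResponse) {
	if req.State.Raw.IsNull() || req.Plan.Raw.IsNull() {
		return // nothing to compare on create or destroy
	}
	var keepers types.Map
	resp.Diagnostics.Append(req.Plan.GetAttribute(ctx, path.Root("keepers"), &keepers)...)
	if resp.Diagnostics.HasError() || !keepers.IsNull() {
		return // the keepers decide when to replace, not the digest
	}
	// ... otherwise fall back to the digest comparison shown earlier ...
}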

Here is a working example:

terraform {
  required_providers {
    random = {
       source = "hashicorp/random"
    }
    remotefs = {
      source = "X-Cli/remotefs"
    }
  }
}

variable "secret_version" {
  description = "Increase this value when the secrets need to be renewed"
  type = number
  default = 0
}

provider "remotefs" {
  webdav = {
    base_url = "YOUR_WEBDAV_SERVER_URL"
  }
}

ephemeral "random_password" "mysecret" {
  length = 16
}

resource "remotefs_file" "mysecret" {
  keepers = {
    secret_version = var.secret_version
  }
  path = "/mysecret"
  inline_content = ephemeral.random_password.mysecret.result
}

Conclusion

Write-only attributes are weird to implement because they require extra logic and machinery you have probably never used before in the Terraform plugin framework. Nevertheless, they are an essential part of the ongoing revolution in secret management within Terraform configurations.

Old farts like me, who enjoy publishing content thanks to reliable Internet standard protocols, can now do so with Terraform.

In the coming weeks, I expect to publish an update to this provider that adds support for the SFTP protocol, in case you already have the means to verify the SSH host key (a known_hosts file or SSHFP DNS records).


  1. Go has no efficient and universal way of zeroizing freed memory, so you need to rely on the kernel to do it for you.

  2. Server-side state encryption is just sending your state in clear text to a service provider and hoping that they will encrypt the state as they said they would. Also, let's hope there are no early TLS channel terminations, like Cloudflare does.