How to use terraform.io to change the image of a stateful server without downtime or data loss?
There's no simple answer to this question.
Using an architecture designed around images (commonly referred to as "immutable infrastructure") works fantastically for stateless services, like your application servers.
It's most definitely possible to extend that to your stateful services with the right tools, failover systems and upgrade paths, but those are usually overkill for simple systems (like the one you describe).
One thing to keep in mind when using these tools is that you don't have to go "all in". Packer and Terraform are designed to work only where you want them; they deliberately don't enforce a pattern across all of your systems.
Practically speaking, the best way to handle this problem is to maintain your database servers differently, outside of Packer (build the initial image with it, yes, but don't necessarily upgrade them in the same way as the stateless web servers), or to outsource managing the state to someone else. Notable options include Heroku Postgres and AWS RDS.
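If you go the RDS route, the database can live in the same Terraform configuration as the rest of your infrastructure. A minimal sketch, assuming the AWS provider's aws_db_instance resource; the engine, instance class, storage size and credentials below are placeholders:

variable "db_password" {}

resource "aws_db_instance" "app" {
  engine            = "postgres"       # placeholder engine
  instance_class    = "db.t3.micro"    # placeholder size
  allocated_storage = 20               # GiB, placeholder
  username          = "app"
  password          = var.db_password

  skip_final_snapshot = true           # convenient for test setups; reconsider for production
}

Terraform then manages the database's lifecycle, while RDS takes care of the data, backups and failover.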
To round it out – yes, it's possible, but with our current tooling it's probably more trouble than it's worth at smaller scale or with simple architectures.
Packer and Terraform can still be a huge boon in other aspects of the very same infrastructure – Terraform, for example, could provision a Heroku database for use by your DigitalOcean application servers in a very straightforward way. Packer can handle upgrading and releasing your application server images, and likewise your development images.
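As a rough illustration of that mixed setup, here is a sketch using the Terraform Heroku provider. Treat it as an outline rather than a copy-paste configuration: the app name and plan are placeholders, and the exact attribute names (e.g. app vs. app_id on heroku_addon) vary between provider versions:

resource "heroku_app" "db_host" {
  name   = "example-db-host"            # placeholder app name
  region = "us"
}

resource "heroku_addon" "postgres" {
  app  = heroku_app.db_host.name        # newer Heroku provider versions use app_id here
  plan = "heroku-postgresql:hobby-dev"  # placeholder plan
}

The DATABASE_URL that Heroku Postgres exposes can then be handed to your DigitalOcean application servers through whatever configuration mechanism you already use.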
I think Terraform now has the features required here. The basic pattern is to define your data volumes separately and attach them to the instances, so that when an instance is destroyed and a new one is created (e.g. from a new AMI built by Packer), the existing volume can be attached to the new instance.
So the detailed steps with Terraform would be:
- define aws_ebs_volume resources
- attach these to your instances with aws_volume_attachment (here I've used device_name = "/dev/xvdh")
- ensure your aws_ebs_volume resources include the lifecycle rule prevent_destroy = true (so they will never be deleted by Terraform)
- ensure your aws_volume_attachment resources include skip_destroy = true (on upgrade Terraform would fail to destroy these whilst the volume is mounted, and stopping the instance will destroy the attachment anyway, so there is no need for Terraform to attempt it)
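Putting those steps together, a minimal sketch; the availability zone, size and the aws_instance.app it attaches to are placeholders, while the resource and attribute names are those of the Terraform AWS provider:

resource "aws_ebs_volume" "data" {
  availability_zone = "eu-west-1a"   # placeholder; must match the instance's AZ
  size              = 20             # GiB, placeholder

  lifecycle {
    prevent_destroy = true           # never let Terraform delete the data volume
  }
}

resource "aws_volume_attachment" "data" {
  device_name  = "/dev/xvdh"
  volume_id    = aws_ebs_volume.data.id
  instance_id  = aws_instance.app.id  # assumes an aws_instance.app defined elsewhere
  skip_destroy = true                 # skip detaching at destroy time; replacing the instance handles it
}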
The final step is to ensure that the instance mounts the volume on startup, which can be achieved with the following in the user_data of your aws_instance resource:
#!/bin/bash
mkdir /data #create mount point
mount /dev/xvdh /data #mount it
For the above to work you'll need to prepare the volume by creating a filesystem, but this is only required once:
mkfs -t ext4 /dev/xvdh
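For completeness, a sketch of how that script might sit inside the instance resource; the AMI ID and instance type are placeholders (the AMI would typically come from your Packer build):

resource "aws_instance" "app" {
  ami           = "ami-12345678"   # placeholder: AMI produced by Packer
  instance_type = "t2.micro"       # placeholder

  user_data = <<-EOF
    #!/bin/bash
    mkdir /data              # create mount point
    mount /dev/xvdh /data    # mount it
  EOF
}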
More details are available in Terraform issue #2740.