Shared File Systems between multiple AWS EC2 instances

This use case has been sought after in AWS for quite a while. As described in this thread, the two common ways to accomplish it were to use S3 or NFS to share data between instances.

On April 9th 2015, Amazon announced Amazon Elastic File System (Amazon EFS), which provides what you are asking for in your diagram.


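A standard EBS volume can be attached to only one EC2 instance at a time. (EBS Multi-Attach, added later for Provisioned IOPS io1/io2 volumes, is limited to instances in a single Availability Zone and needs a cluster-aware file system, so it does not solve the cross-AZ problem here.)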

Your diagram offloads the data to a shared server. However, that shared server is simply another single point of failure, so you're not saving yourself anything: if the AZ of that server goes down, then you've lost the data, even if the web server/VisualSVN server in another AZ is still running.
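To put rough numbers on that argument, here is a minimal sketch in Python. The availability figures are made up for illustration (they are not AWS numbers), and failures are assumed to be independent of each other.

```python
def parallel(*availabilities):
    """Availability of redundant components: the group fails only if all fail."""
    all_fail = 1.0
    for a in availabilities:
        all_fail *= (1.0 - a)
    return 1.0 - all_fail


def series(*availabilities):
    """Availability of a chain of dependencies: every component must be up."""
    up = 1.0
    for a in availabilities:
        up *= a
    return up


# Hypothetical figures, chosen only for illustration.
WEB_SERVER = 0.99      # one web/VisualSVN instance in one AZ
SHARED_SERVER = 0.99   # the single shared file server in one AZ

web_tier = parallel(WEB_SERVER, WEB_SERVER)      # two instances in two AZs
whole_system = series(web_tier, SHARED_SERVER)   # both depend on the shared server

print(f"web tier alone:     {web_tier:.4f}")
print(f"with shared server: {whole_system:.4f}")
```

Under these assumptions the redundant web tier is more available than either instance alone, but the system as a whole can never be more available than the single shared server it depends on.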

You should split your server's two functions onto two separate servers/clusters so that each can be handled independently of the other:

  1. web server, and
  2. VisualSVN server

For the web server, do you really need to mirror the volume in a multi-instance scenario, or can you keep your instances disposable, so that any of them can be terminated at any time without data loss? Ideally, you would not save any data locally to the instance. Instead, you would save all data off-server to a database or to Amazon S3. This way, the data is available to all instances, all the time. If the server is lost, none of the data is. Make your "master" AMI and create all instances in an auto-scaling group from that master AMI. When your web server code changes, deploy a new AMI, terminate the old instances and create new ones from the new AMI.
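The "save nothing locally" idea might look roughly like this in Python. This is a sketch under assumptions, not a complete implementation: the bucket name, key layout, and function names are hypothetical, and `s3` is assumed to be a boto3 S3 client (for example `boto3.client("s3")`), of which only `put_object` and `get_object` are used.

```python
BUCKET = "example-app-uploads"  # hypothetical bucket name


def save_upload(s3, user_id, filename, data):
    """Write an uploaded file to S3 instead of the instance's local disk.

    `s3` is assumed to be a boto3 S3 client; `data` is bytes.
    Returns the object key so it can be recorded in the database.
    """
    key = f"uploads/{user_id}/{filename}"  # hypothetical key layout
    s3.put_object(Bucket=BUCKET, Key=key, Body=data)
    return key


def load_upload(s3, key):
    """Read the file back; this works from any instance, not only the writer."""
    response = s3.get_object(Bucket=BUCKET, Key=key)
    return response["Body"].read()
```

Because no instance holds the only copy of a file, any instance in the auto-scaling group can be terminated and replaced from the AMI without losing uploads.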

For the VisualSVN server, the question to ask is whether VisualSVN can handle the volume data changing underneath it without the running process being aware of it. For example, if the running process caches some data in RAM, what happens if some hard drive sync process comes along behind its back and changes the data on disk? It could be that the VisualSVN server simply is not able to handle a multi-instance scenario. Depending on the answer to that, you may not be able to cluster the VisualSVN server. It's possible that VisualSVN server has its own clustering feature. If so, then you should investigate that.
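The caching concern can be shown with a toy example in Python. It says nothing about how VisualSVN Server actually behaves; the class and file name are hypothetical and only illustrate what goes wrong when a process trusts its in-memory copy while something else rewrites the disk.

```python
import os
import tempfile


class CachingReader:
    """Toy server process that reads a file once and then serves it from RAM."""

    def __init__(self, path):
        self.path = path
        self._cache = None

    def read(self):
        if self._cache is None:          # only touches the disk on first use
            with open(self.path) as f:
                self._cache = f.read()
        return self._cache


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "repo-state.txt")  # hypothetical file name
        with open(path, "w") as f:
            f.write("revision 1")

        server = CachingReader(path)
        before = server.read()           # loads and caches "revision 1"

        # A sync process (or a second instance) rewrites the file
        # behind the running server's back.
        with open(path, "w") as f:
            f.write("revision 2")

        after = server.read()            # served from the cache, not the disk
        with open(path) as f:
            on_disk = f.read()
        return before, after, on_disk


if __name__ == "__main__":
    print(demo())
```

Here the running process keeps answering with the old revision while the disk already holds the new one. Software that is not designed for shared storage can end up in the same kind of inconsistent state.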


I believe the standard Windows Server approach to your architecture problem would be to implement the "Shared File Systems" block in your diagram with a Windows Shared Storage Space that is clustered, spreading the nodes across at least two availability zones.

This theoretically gets you something that looks to the applications like a standard file share but which has no single point of failure. This might be overkill for your situation, but if you're going for fault tolerance this will help protect against additional failure scenarios.

While not specific to your problem, this article on Windows Server Failover Clustering may help give a sense of the scope of the effort: http://aws.amazon.com/microsoft/whitepapers/microsoft-wsfc-sql-alwayson/