Amazon Web Services Elastic Compute Cloud (a.k.a. EC2) is a service that lets anyone with a credit card rent a virtualized server from Amazon. To cater to different clients' needs, AWS provides various instance types, which are either general-purpose or optimized for a specific resource (CPU, RAM, or I/O). You can see the different types in Fig 1. This blog post talks about one storage-optimized option, the I3 instance family, its little-known problem, and the solution in the form of Elastic Block Store (a.k.a. EBS).
The AWS EC2 I3 Family
The I3 instance family provides storage-optimized instances designed for workloads with high transaction rates, low latency, and high IOPS. To achieve these goals, I3 is built with Non-Volatile Memory Express (NVMe) SSDs. If the name NVMe confuses you: NVMe is the interface used to talk to the SSD, and "non-volatile" means that the drive itself does not lose data when the power is out. Overall, the main purpose of the I3 instance family is to serve storage-intensive applications at a low cost. The following matrix (Fig 2) shows the different resource bundles available in the family.
The I3 instance family looks solid, so what is the problem? While an NVMe SSD would NOT lose data due to a power outage, the way AWS provisions it makes the storage anything but persistent: it is exposed as an instance store, which is temporary (ephemeral) storage. This instance storage will in fact lose its data if the EC2 instance is stopped or terminated. In other words, I3 storage is persistent only as long as the instance it's bound to is up and running.
Why would you stop an EC2 instance, you might ask? It's very common to stop an instance to save money or to downgrade/upgrade it. What is not very common is to lose all your data as a result. Stop an instance and you lose access to that I3 drive for good, which amounts to losing the data. Now, if that drive was used for temporary storage, or its data can be quickly replenished, you are fine. But if you were storing permanent and unique data there, you are in big trouble.
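One way to avoid the surprise is to check for instance storage before stopping anything. The sketch below is a minimal illustration, not an official AWS tool: the helper names instance_store_gb and stop_warning are made up for this post, and the two sample dictionaries are hand-written (with illustrative values) to mimic the shape of an entry in the "InstanceTypes" list returned by the EC2 DescribeInstanceTypes API.

```python
from typing import Optional

# Hand-written samples shaped like entries of the "InstanceTypes" list returned
# by the EC2 DescribeInstanceTypes API. The values are illustrative only.
I3_SAMPLE = {
    "InstanceType": "i3.large",
    "InstanceStorageSupported": True,
    "InstanceStorageInfo": {"TotalSizeInGB": 475},
}
EBS_ONLY_SAMPLE = {
    "InstanceType": "t3.micro",
    "InstanceStorageSupported": False,
}


def instance_store_gb(type_info: dict) -> int:
    """Return the total instance store size in GB, or 0 if the type has none."""
    if not type_info.get("InstanceStorageSupported", False):
        return 0
    return type_info.get("InstanceStorageInfo", {}).get("TotalSizeInGB", 0)


def stop_warning(instance_id: str, type_info: dict) -> Optional[str]:
    """Return a warning if stopping the instance would wipe instance storage."""
    size_gb = instance_store_gb(type_info)
    if size_gb == 0:
        return None  # nothing ephemeral to lose
    return (
        f"{instance_id} ({type_info['InstanceType']}) has {size_gb} GB of "
        "instance store; stopping it will erase that data."
    )


# The instance ID below is a placeholder, not a real instance.
for sample in (I3_SAMPLE, EBS_ONLY_SAMPLE):
    print(stop_warning("i-0123456789abcdef0", sample) or "safe to stop")
```

With boto3 installed and credentials configured, the same dictionary could presumably be fetched with boto3.client("ec2").describe_instance_types(InstanceTypes=["i3.large"])["InstanceTypes"][0] instead of being written by hand.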
The solution is easy. If you can't afford to lose the data on I3 instance storage, don't put it there. Use a different EC2 instance type, or use dedicated network-attached EBS volumes for data that you can't afford to lose. With EBS volumes, you will have no problems stopping and restarting an instance: the data comes back online with it. Don't use I3 attached storage unless losing it is part of the plan; otherwise complement it with EBS volumes, or be prepared to lose data. Be aware of the I3 storage that you are using: it is not persistent, it only lasts while the instance is running.
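The rule of thumb above can be written down as a tiny decision function. This is only a sketch of the reasoning in this post: the function name choose_storage, its two flags, and the dataset names are all hypothetical, and treating "replicated on other nodes" as safe is my own assumption about how scale-out systems are typically run.

```python
def choose_storage(regenerable: bool, replicated_elsewhere: bool) -> str:
    """Pick a home for a dataset on an I3 instance.

    Data that can be rebuilt, or that also lives on other nodes, can sit on
    the fast instance store. Everything else belongs on an EBS volume.
    """
    if regenerable or replicated_elsewhere:
        return "instance-store"
    return "ebs"


# Hypothetical datasets: (regenerable, replicated_elsewhere)
DATASETS = {
    "query-cache": (True, False),
    "nosql-replica": (False, True),
    "orders-db": (False, False),
}

for name, (regenerable, replicated) in DATASETS.items():
    print(f"{name}: {choose_storage(regenerable, replicated)}")
```

Only the last dataset, which is unique and cannot be rebuilt, ends up on EBS.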
It would be nice if AWS could add some appropriate wording, such as “This storage is not completely persistent, use it at your own risk!”. In my humble opinion, I3 is not super useful for regular database applications. At the same time, it will probably do just fine for some NoSQL scale-out databases.
This post is partially based on the following resources: