EC2 and RIAK

Mathias Meyer mathias at
Fri Apr 1 09:21:54 EDT 2011

Hi David,

Alexander already gave you a good rundown on EC2 and Riak, but let me add some of my own experiences running databases on EC2 in general. 

The short answer is, Riak is certainly successfully used in production on EC2, so nothing should hold you back from testing a setup on EC2. But there's a whole bunch of things you should keep in mind.

First, it's probably a good idea to avoid using ephemeral storage as persistent storage. Even though it rarely happens, instances can crash on EC2 for all kinds of reasons, most often a hardware failure of the underlying host, and the data on ephemeral storage is lost with them.

Cluster compute instances offer especially high CPU power, but what you really want is fast and reliable storage I/O, persisted for eternity if need be. CC instances are certainly a lot better than any other instance type in terms of general I/O (see [2] for a comparison), but they fall prey to similar limitations in terms of network storage I/O as other instance types, see below.

The RAID 0'd ephemeral storage on the cluster compute instances may sound good in theory in terms of performance, but in practice it takes away data durability in case of a single disk failure. One disk fails, and the data on that node is gone. Depending on what kind of seeks you're doing, an EBS setup may even turn out to be faster. See [6] and [4] for a comparison and some initial and extended measurements, and [7] for another comparison. But certainly, the cluster compute instance's ephemeral storage can achieve a good amount of throughput, see [5] for some pretty graphs comparing both RAID and non-RAID setups.

As Alexander pointed out, multiple instance failures can make this scenario a real killer, though you end up with the same risks as running on raw iron servers. Neither ephemeral storage nor EBS makes the need for a proper backup disappear. You could, for example, run off ephemeral storage, relying on both Riak's replication and a good backup to an EBS volume or to S3.

EBS on the other hand is prone to a large variance in network latency, making performance at any point unpredictable and unreliable. Every measurement you take is likely to be different an hour later. This may sound extreme, but it turns out to be a very big issue for databases where there's lots of disk I/O involved to read and write data, as is the case with Riak's Bitcask storage.
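To get a feel for that variance on your own volumes, you could time synced writes at different times of day and compare the spread. The following is only a minimal sketch, not a proper benchmark: the function names are made up for illustration, and the temp directory is a placeholder for wherever your EBS volume is mounted.

```python
import os
import statistics
import tempfile
import time


def time_synced_writes(directory, count=50, block_size=4096):
    """Return the latency in seconds of `count` fsync'd writes in `directory`."""
    payload = os.urandom(block_size)
    latencies = []
    # The temp file lives on the volume under test and is removed afterwards.
    with tempfile.NamedTemporaryFile(dir=directory) as handle:
        for _ in range(count):
            start = time.perf_counter()
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # force the block down to the device
            latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Condense a list of latencies into mean, standard deviation and maximum."""
    return {
        "mean": statistics.mean(latencies),
        "stdev": statistics.stdev(latencies),
        "max": max(latencies),
    }


if __name__ == "__main__":
    # Placeholder path: point this at the EBS mount you want to measure.
    print(summarize(time_synced_writes(tempfile.gettempdir())))
```

Running it a few times over a day and comparing the standard deviation and maximum says more about EBS than any single run does.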

You can increase the performance and reliability of EBS by using a RAID of volumes. Preferably go for a RAID 5 or RAID 10 to add redundancy. There are mixed opinions on whether that's really necessary on EBS, with Amazon keeping the data redundant on their end as well, but in general, it's a good tradeoff between increased performance through striping and increased redundancy through mirroring. [1] has a good summary of when it's better to choose RAID 5 vs. 10.

RAID 0 will obviously bring the best performance, and it's certainly a valid setup. We've been running RAID 0 setups with 4 volumes, and got great improvements over a single volume. You're also likely to achieve more throughput on bigger instances with a setup like this. The caveat once again is that one corrupted volume is enough to make a RAID 0 setup unusable.
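To make the tradeoff between the RAID levels concrete, here is a small sketch of the textbook capacity and failure-tolerance figures. It says nothing about actual EBS performance; the function name and the 100 GB volume size in the usage note are just assumptions for illustration.

```python
def raid_profile(level, volumes, volume_gb):
    """Return usable capacity and guaranteed failure tolerance of an array.

    Textbook figures only. "failures_tolerated" is the number of volume
    failures the array survives in the worst case.
    """
    if level == 0:
        if volumes < 2:
            raise ValueError("RAID 0 needs at least 2 volumes")
        # Pure striping: all capacity is usable, no redundancy at all.
        return {"usable_gb": volumes * volume_gb, "failures_tolerated": 0}
    if level == 5:
        if volumes < 3:
            raise ValueError("RAID 5 needs at least 3 volumes")
        # One volume's worth of space goes to parity.
        return {"usable_gb": (volumes - 1) * volume_gb, "failures_tolerated": 1}
    if level == 10:
        if volumes < 4 or volumes % 2:
            raise ValueError("RAID 10 needs an even number of volumes, 4 or more")
        # Mirrored pairs, striped: half the space, and two failures in the
        # same pair are fatal, so only one failure is guaranteed to be safe.
        return {"usable_gb": volumes // 2 * volume_gb, "failures_tolerated": 1}
    raise ValueError("only RAID 0, 5 and 10 are covered in this sketch")
```

With four volumes of 100 GB each, raid_profile(0, 4, 100) gives 400 GB with no tolerance for failure, raid_profile(5, 4, 100) gives 300 GB, and raid_profile(10, 4, 100) gives 200 GB, the latter two each surviving one failed volume.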

Another crazy thought is to set up a RAID striping across a bunch of ephemeral drives and EBS volumes, maximizing throughput on both local and network storage. But know what you're getting yourself into with this kind of setup, especially when your write load is a lot heavier than the available network bandwidth can handle, a scenario where your network volumes will never be able to catch up with the local storage.
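A rough way to reason about the catch-up problem: in a plain stripe every member receives an equal share of each write, so under sustained load the whole array moves at the pace of its slowest member. The sketch below is a deliberately simplified model under that assumption (no caching, no write-back buffering), and the throughput numbers in the usage note are invented, not measurements.

```python
def striped_throughput(member_mb_per_s):
    """Estimate sustained throughput of a stripe with equal shares per member.

    Simplified model: each member gets 1/n of the data, so the slowest
    member caps the rate of the whole array.
    """
    if not member_mb_per_s:
        raise ValueError("a stripe needs at least one member")
    return len(member_mb_per_s) * min(member_mb_per_s)
```

For two local drives at a hypothetical 100 MB/s and two network volumes at a hypothetical 40 MB/s, striped_throughput([100, 100, 40, 40]) comes out at 160 MB/s, well short of the 280 MB/s you get from simply adding the members up.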

All that said, EBS I/O sure is reasonably fast, but it depends on your particular use case and performance requirements. It's also worth noting that the I/O capabilities of EBS increase with the instance size. The bigger your instance, the more throughput you'll achieve (see [3]). Bigger instances tend to have better network throughput in general, with cluster compute instances obviously having some of the highest bandwidth available.

All this turns out to be much less of a problem when data can be held in memory very easily, e.g. with Innostore, where you can read and write to/from cache buffers first and then have InnoDB take care of flushing to disk.

Personally, I don't think you're overcomplicating things in regard to multiple availability zones. It's a good idea when the highest possible availability is your goal, as it's usually just a single availability zone that's affected by increased latency or network timeouts. But as Alexander said, you should think about having cross-datacenter replication in that scenario, as availability zones are data centers in different physical locations. Usually they're not that far apart, but far enough to increase latency considerably. As always, it depends on your particular use case.

Now, after all this real talk, here's the kicker. Riak's way of replicating data can make both scenarios work, as long as it's ensured that your data is replicated on more than one node. You could use ephemeral storage and be somewhat safe because data will reside on multiple nodes. The same is true for EBS volumes, as potential variances in I/O or even minutes of total unavailability (as seen in the recent Reddit outage) can be recovered from a lot more easily thanks to handoff and read repair. You can increase the number of replicas (n_val) to increase your tolerance of instance failure, just make sure that n_val is less than the number of nodes in your cluster.
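As a quick sanity check when planning a cluster, the n_val rule above can be written down in a few lines. This is only a sketch with made-up helper names, and it assumes every replica of a key lands on a distinct physical node, which is worth verifying for small clusters.

```python
def n_val_fits_cluster(n_val, node_count):
    """Apply the rule of thumb: n_val should be less than the node count."""
    return 1 <= n_val < node_count


def replicas_surviving(n_val, lost_replica_nodes):
    """Copies of a key left when that many of its replica holders are lost.

    Assumes one replica per physical node (an assumption of this sketch).
    """
    return max(n_val - lost_replica_nodes, 0)
```

For example, n_val_fits_cluster(3, 5) is true while n_val_fits_cluster(3, 3) is not, and replicas_surviving(3, 1) leaves two copies from which handoff and read repair can work.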

Don't get me wrong, I love EC2 and EBS. Being able to spin up servers at any time and to attach more storage to a running instance is extremely powerful, when you can handle the downsides. But if very low latency is what you're looking for, raw iron with lots of memory and SSDs as storage devices thrown on top is hard to beat.

When in doubt, start with a RAID 0 setup on EBS with 4 volumes, and compare it with a RAID 5 in terms of performance. Both are known to give good enough performance in a lot of cases. If you decide to go with a RAID, be sure to add LVM on top for simpler snapshotting, as getting consistent snapshots of a bunch of striped volumes using just EBS snapshots is quite painful, if not impossible.

Let us know if you have more questions. There are lots of details involved when you're going under the hood, but this should cover the most important bases.

Mathias Meyer
Developer Advocate, Basho Technologies


On Wednesday, 30 March 2011 at 18:29, David Dawson wrote:
> I am not sure if this has already been discussed, but I am looking at the feasibility of running RIAK in an EC2 cloud, as we have a requirement that may require us to scale up and down quite considerably on a month by month basis. After some initial testing and investigation we have come to the conclusion that there are 2 solutions although both have their downsides in my opinion:
> 1. Run multiple cluster compute( cc1.4xlarge ) instances ( 23 GB RAM, 10 Gigabit ethernet, 2 x 845 GB disks running RAID 0 )
> 2. Same as above but using EBS as the storage instead of the local disks.
> The problems I see are as follows with solution 1: 
> - An instance failure results in complete loss of data on that machine, as the disks are ephemeral storage ( e.g. they only exist whilst the machine is up ).
> The problems I see are as follows with solution 2:
> - EBS is slower than the local disks and from what I have read is susceptible to latency depending on factors out of your control.
> - There has been a bit of press lately about availability problems with EBS, so we would have to use multiple availability zones although there are only 4 in total and it just seems as though I am over complicating things.
> Has anyone used EC2 and RIAK in production and if so what are their experiences?
> Otherwise has anyone used RackSpace or Joyent? as these are alternatives although the Joyent solution seems very expensive, and what are their experiences?
> Dave
> _______________________________________________
> riak-users mailing list
> riak-users at