Questions about Riak Enterprise
ahmed.m.bashir at gmail.com
Thu May 10 14:09:36 EDT 2012
Can you elaborate on how EDS replication does this mirroring? Does
each vnode have the ability to connect to the other cluster, or is
there a coordinator that sends data to the other cluster, etc?
On Mon, May 7, 2012 at 6:25 PM, Andrew Thompson <andrew at hijacked.us> wrote:
> On Tue, May 01, 2012 at 03:49:02PM -0400, Mark Rose wrote:
>> I've got some questions about Riak Enterprise I haven't been able to find
>> the answers to.
> Hi Mark, I'm the riak EDS 'maintainer'. Sorry I didn't reply earlier, I
> was travelling all week.
>> I understand that the open source version of Riak's replication is designed
>> for single data center usage only, but I'm unsure about how Riak Enterprise
>> handles replication. Specifically, I'm curious about locality and high
>> Our setup is already running in multiple availability zones on EC2. We're
>> running Galera across the zones to provide both redundancy and a local copy
>> of the data to avoid the network latency of going to another zone. However,
>> Galera, as nice as it is, doesn't scale writes. We're going to be using
>> Riak to store a lot of information going forward, and may eventually move
>> our existing data to it as well.
>> The only thing holding us back from going to multiple regions on Amazon is
>> our datastore.
>> How well does Riak handle layered topologies, such as EC2?
>> Is it possible to configure Riak Enterprise to store two copies of the data
>> in each EC2 region, ensuring that the two copies are in different zones
>> when there is more than one Riak server in a zone?
> Current EDS replication is pretty simple: it will just try to
> (eventually) ensure that data on one cluster is mirrored on another. It
> won't forward reads and riak doesn't have anything like 'rack
> awareness', at least not yet.
>> When a query is run, is it run in one region only? Would Riak prefer copies
>> of the data in the local zone?
> Riak only queries the local cluster, yes.
>> For what it's worth, our current datastore load is roughly half and half
>> writes and reads. We heavily cache reads with memcache (99%). We may drop
>> memcache if reads on Riak prove fast enough (thus avoiding the issues of
>> invalidating remote caches).
> Given the current limitations, you'd probably be best off with N
> clusters in different regions and/or zones. Don't try to span a single
> cluster across zones, or even worse, across regions. Then hook them together
> with replication.
> There's also some fun with NAT on EC2, but it can be made to work.
> Let me know if that helps,
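To make sure I'm reading the description above correctly, here is a toy Python model of it: independent clusters, reads served only by the local cluster, and writes mirrored to the other cluster eventually. This is only a sketch of the behaviour as described in the thread. The `Cluster` class, its methods, and the cluster names are all made up for illustration; none of it is Riak's API, and it says nothing about how EDS actually ships data between clusters (which is what I'm asking).

```python
from collections import deque


class Cluster:
    """Toy stand-in for one independent cluster (hypothetical, not Riak's API)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.sinks = []         # clusters this one mirrors its writes to
        self.pending = deque()  # local writes not yet shipped to the sinks

    def connect(self, other):
        self.sinks.append(other)

    def put(self, key, value):
        # A client write lands locally and is queued for later mirroring.
        self.data[key] = value
        self.pending.append((key, value))

    def get(self, key):
        # Reads are answered by the local cluster only; never forwarded.
        return self.data.get(key)

    def replicate(self):
        """Ship queued writes to every sink; return how many were shipped."""
        shipped = 0
        while self.pending:
            key, value = self.pending.popleft()
            for sink in self.sinks:
                # Applied directly to the sink's data, so it is not re-queued
                # there and cannot bounce back and forth between clusters.
                sink.data[key] = value
            shipped += 1
        return shipped


# One cluster per region, hooked together in both directions.
us_east = Cluster("us-east")
eu_west = Cluster("eu-west")
us_east.connect(eu_west)
eu_west.connect(us_east)

us_east.put("user:1", "mark")
before = eu_west.get("user:1")  # replication has not run yet
us_east.replicate()
after = eu_west.get("user:1")   # now mirrored to the other cluster
```

In this model a read in the other region misses until replication has run, which is the "eventually" part; applications that read right after a write in a different region would have to tolerate that window.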