Post commit hooks are a single process, so they are executed in the same order as the commits ?

Simon Majou simon at
Tue Apr 9 06:36:34 EDT 2013


Yes, I know about Riak Enterprise, but I am thinking about a more
"fine-grained" multi-cluster setup, by setting an X-Meta-primary-cluster
property on each value.

In that scenario, the object would always be written to the primary
cluster, and read from both clusters.
In case of failure, all writes would go to the cluster that is still
active, with the primary-cluster property updated to the new primary.
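
To make the idea concrete, here is a rough sketch in Python of the routing
rule I have in mind. Everything in it is hypothetical (StoredObject,
choose_write_cluster, the cluster names, and the "first healthy cluster by
name" failover choice); only the X-Meta-primary-cluster property name comes
from above. It is not Riak client code, just the decision logic.

```python
from dataclasses import dataclass, field

# Property name as used in this message; all other names are illustrative.
META_KEY = "X-Meta-primary-cluster"


@dataclass
class StoredObject:
    value: bytes
    meta: dict = field(default_factory=dict)


def choose_write_cluster(obj, healthy_clusters, default_primary):
    """Return the cluster that should receive the write for obj.

    If the recorded primary is healthy, write there. Otherwise fail over
    to another healthy cluster and record it as the new primary.
    """
    primary = obj.meta.get(META_KEY, default_primary)
    if primary in healthy_clusters:
        return primary
    if not healthy_clusters:
        raise RuntimeError("no cluster available for writes")
    # Assumption: pick deterministically so every writer agrees on the choice.
    new_primary = sorted(healthy_clusters)[0]
    obj.meta[META_KEY] = new_primary
    return new_primary
```

Reads are not covered here, since they could go to either cluster.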

From your explanation of the post-commit hook, I understand that I will
need a message queue between the clusters, in order to free the hook as fast
as possible and to not lose any updates if one cluster crashes.
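
Something like the following is what I picture for that queue. It is only a
sketch under my own assumptions: the real hook would be written in Erlang,
and here a local SQLite table stands in for whatever message queue would
actually be used. The Outbox class and the send callback are hypothetical
names. The hook would only call enqueue(), and a separate worker would call
drain() to push the updates to the other cluster.

```python
import sqlite3


class Outbox:
    """Durable, ordered queue of updates waiting to be replicated."""

    def __init__(self, path=":memory:"):
        # Use a file path instead of ":memory:" to survive a process crash.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
            "bucket TEXT, key TEXT, value BLOB)"
        )
        self.db.commit()

    def enqueue(self, bucket, key, value):
        # Called from the hook: one local insert, then return immediately.
        self.db.execute(
            "INSERT INTO outbox (bucket, key, value) VALUES (?, ?, ?)",
            (bucket, key, value),
        )
        self.db.commit()

    def drain(self, send):
        """Deliver queued updates in commit order; stop at the first failure.

        An entry is deleted only after send() returns, so a crash between
        send and delete redelivers it (at-least-once delivery).
        """
        delivered = 0
        rows = self.db.execute(
            "SELECT seq, bucket, key, value FROM outbox ORDER BY seq"
        ).fetchall()
        for seq, bucket, key, value in rows:
            try:
                send(bucket, key, value)
            except ConnectionError:
                break  # remote cluster unreachable; keep the rest queued
            self.db.execute("DELETE FROM outbox WHERE seq = ?", (seq,))
            self.db.commit()
            delivered += 1
        return delivered
```

Since delivery is at-least-once, the receiving side would have to tolerate
seeing the same update twice.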


On Tue, Apr 9, 2013 at 3:55 AM, Ryan Zezeski <rzezeski at> wrote:

> Simon,
> On Mon, Apr 8, 2013 at 7:14 PM, Simon Majou <simon at> wrote:
>> Hello,
>> I want to sync a bucket of a first cluster with the bucket of a second
>> cluster. To do that, I am thinking of using the post-commit hook.
> If you didn't know, this is exactly what Riak Enterprise was built to do,
> i.e. handle multi-cluster replication.  However, if you want to give it a
> go on your own, a post-commit hook is one way to get the job done.  You'll
> want to think through failure scenarios where the receiving cluster is down
> and how to deal with msgs that are dropped between clusters.  The
> post-commit hook runs on a process called the "coordinator"; there is a
> coordinator for every incoming request.  So you won't block the vnodes,
> which is important, but the client/user request will block until your
> post-commit returns.
>> Is there any risk that the sequence of PUTs could get mixed up in such a
>> scenario?
> Do you mean the sequence seen on cluster A vs. cluster B?  Are you asking
> if the object could appear to be on B before A even though the PUT was sent
> to A?  The answer is, it depends.  With a healthy system it's probably
> unlikely but it will depend on your DW values and state of each cluster.
> E.g. if cluster A nodes get slow disk I/O then perhaps the replication to
> cluster B could beat writes on A.  If we start introducing node and network
> failures, or changing W/DW values then things can get more complicated.
> You could have success on cluster A, fire replica to cluster B, all
> primary nodes for that object on cluster A die, now cluster B will have a
> key for which cluster A says not_found (well, not totally true, depends on
> your PR value).
> -Z

More information about the riak-users mailing list