Post commit hooks are a single process, so they are executed in the same order as the commits?

Guido Medina
Tue Apr 9 07:01:10 EDT 2013


   We use a similar approach, based on 2i and batch numbers: every 
bucket/key we are interested in is stamped with a 2i batch number, which 
increases once a minute. A Java client copies to two different clusters, 
also once a minute, from "last batch number copied" to "current batch 
number - 1", so it is always a minute behind. As I said, it uses 2i (the 
current batch number is read-write on the source cluster and read-only 
for the target clusters), and it also copies keys concurrently.
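
For readers who want to see the copy window spelled out, here is a minimal sketch in Python (the tool described above is a Java client; this is not its code). Plain dicts stand in for the clusters and for the 2i batch index, and the names `batch_number`, `copy_window` and `copy_batches` are hypothetical. It also assumes the copy resumes at the batch after "last batch number copied", which the description leaves open.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SECONDS = 60  # the batch number increases once a minute


def batch_number(epoch_seconds):
    """Map a timestamp (seconds since epoch) to its one-minute batch number."""
    return int(epoch_seconds // BATCH_SECONDS)


def copy_window(last_copied, current):
    """Batches to copy: after last_copied, up to and including current - 1."""
    return range(last_copied + 1, current)


def copy_batches(source_index, source, target, last_copied, current, workers=4):
    """Copy every key stamped with a batch in the window.

    source_index is a stand-in for a 2i lookup: batch number -> list of keys.
    source and target are stand-ins for the two clusters.
    Returns the new "last batch number copied".
    """
    keys = [key
            for batch in copy_window(last_copied, current)
            for key in source_index.get(batch, [])]

    def copy_one(key):
        target[key] = source[key]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces evaluation so an exception in any worker is raised here.
        list(pool.map(copy_one, keys))
    return max(last_copied, current - 1)


if __name__ == "__main__":
    source = {"a": 1, "b": 2, "c": 3}
    index = {100: ["a"], 101: ["b"], 102: ["c"]}
    target = {}
    print(copy_batches(index, source, target, last_copied=99, current=102))
    print(sorted(target))
```

The batch that is still being written (the current one) is never copied, which is why the target is always about a minute behind.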

Though we have Riak EE, we felt the same about having a fine-grained 
cluster copy tool.

With that tool in place we need neither post-commit hooks nor synced 
operations on the "master cluster". We don't actually call it a master 
cluster; we merely use the tool to have backups. There is a tracking key 
that holds the batch number value and a list of buckets to operate on.
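
A tracking key like that could look roughly as follows. This is only a sketch under assumptions: the JSON layout, the `Tracking` class and the `run_once` helper are hypothetical, and `copy_bucket` is a placeholder for whatever actually copies one bucket for a range of batches.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Tracking:
    """Value stored under the tracking key (hypothetical layout)."""
    last_batch_copied: int
    buckets: list = field(default_factory=list)

    def to_json(self):
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))


def run_once(tracking, current_batch, copy_bucket):
    """Copy every tracked bucket, then advance the stored batch number.

    copy_bucket(bucket, first_batch, last_batch) is supplied by the caller.
    If it raises, the exception propagates and the tracking value is not
    advanced, so the same window is retried on the next run.
    """
    up_to = current_batch - 1
    if up_to <= tracking.last_batch_copied:
        return tracking  # nothing new to copy yet
    for bucket in tracking.buckets:
        copy_bucket(bucket, tracking.last_batch_copied + 1, up_to)
    return Tracking(up_to, list(tracking.buckets))


if __name__ == "__main__":
    state = Tracking(10, ["users", "orders"])
    state = run_once(state, 13, lambda b, lo, hi: print("copy", b, lo, hi))
    print(state.to_json())
```

Advancing the batch number only after all buckets have been copied means a failed run repeats work rather than skipping it.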

Hope that helps,


On 09/04/13 11:36, Simon Majou wrote:
> Ryan,
> Yes, I know about Riak Enterprise, but I am thinking about a more 
> "fine-grained" multi-cluster setup, by setting an X-Meta-primary-cluster 
> property on each value.
> In that scenario, the object would always be written to the primary 
> cluster, and read from both clusters.
> In case of failure, all writes would go to the active cluster, 
> with the primary-cluster property updated to the new primary.
> From your explanation of the post-commit hook, I understand that I will 
> need a message queue between the clusters, in order to free the hook as 
> fast as possible and to not lose any updates if one cluster crashes.
> Simon
> On Tue, Apr 9, 2013 at 3:55 AM, Ryan Zezeski wrote:
>     Simon,
>     On Mon, Apr 8, 2013 at 7:14 PM, Simon Majou wrote:
>         Hello,
>         I want to sync a bucket of a first cluster with the bucket of
>         a second cluster. To do that I am thinking of using the
>         post-commit hook.
>     If you didn't know, this is exactly what Riak Enterprise was built
>     to do.  I.e. handle multi-cluster replication.  However, if you
>     want to give it a go on your own a post-commit hook is one way to
>     get the job done.  You'll want to think through failure scenarios
>     where the receiving cluster is down and how to deal with msgs that
>     are dropped between clusters.  The post-commit hook runs on a
>     process called the "coordinator"; there is a coordinator for every
>     incoming request.  So you won't block the vnodes, which is
>     important, but the client/user request will block until your
>     post-commit returns.
>         Is there any risk of the sequence of PUTs getting mixed up in
>         such a scenario?
>     Do you mean the sequence seen on cluster A vs. cluster B?  Are you
>     asking if the object could appear to be on B before A even though
>     the PUT was sent to A?  The answer is, it depends.  With a healthy
>     system it's probably unlikely but it will depend on your DW values
>     and state of each cluster.  E.g. if cluster A nodes get slow disk
>     I/O then perhaps the replication to cluster B could beat writes on
>     A.  If we start introducing node and network failures, or changing
>     W/DW values then things can get more complicated.  You could have
>     success on cluster A, fire replica to cluster B, all primary nodes
>     for that object on cluster A die, now cluster B will have a key
>     for which cluster A says not_found (well, not totally true,
>     depends on your PR value).
>     -Z
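
The queue-between-clusters idea from the quoted thread (free the hook quickly, let something else deliver to the other cluster) can be sketched like this. Riak's own hooks are written in Erlang, so this Python is only an illustration of the shape, not a working hook. `ReplicationQueue`, `on_commit` and the `send` callable are hypothetical names, and the in-memory queue is an assumption made for brevity: it does not survive a crash, so the "not lose any updates" requirement would need a durable queue instead.

```python
import queue
import threading


class ReplicationQueue:
    """Hand updates to a background worker so the caller returns at once."""

    def __init__(self, send, max_retries=3):
        self._q = queue.Queue()
        self._send = send            # callable(key, value) that writes to the other cluster
        self._max_retries = max_retries
        self.failed = []             # updates that could not be delivered
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def on_commit(self, key, value):
        """Stand-in for the hook body: enqueue and return immediately."""
        self._q.put((key, value))

    def _drain(self):
        # A single worker takes items in FIFO order, so updates are sent
        # in the order they were enqueued.
        while True:
            item = self._q.get()
            try:
                if item is None:     # sentinel pushed by close()
                    return
                for _ in range(self._max_retries):
                    try:
                        self._send(*item)
                        break
                    except ConnectionError:
                        continue     # receiving cluster unreachable, try again
                else:
                    self.failed.append(item)
            finally:
                self._q.task_done()

    def close(self):
        """Deliver everything already queued, then stop the worker."""
        self._q.put(None)
        self._worker.join()


if __name__ == "__main__":
    rq = ReplicationQueue(lambda k, v: print("replicated", k, v))
    rq.on_commit("user:1", "v1")
    rq.on_commit("user:1", "v2")
    rq.close()
```

Note that this only orders delivery from one queue; it says nothing about the ordering between the two clusters that Ryan describes above, which still depends on W/DW values and on failures.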