Clarifying "Read-before-Write"

Andres Jaan Tack andres.jaan.tack at
Sat Nov 26 08:19:40 EST 2011

Thanks! That explanation is perfect. I guess I should have taken a look at
some of the other clients as an example in the first place.

Now I have something to fix for Riak-Cpp. :)


2011/11/26 Russell Brown <russelldb at>

> On 26 Nov 2011, at 01:14, Andres Jaan Tack wrote:
>
>> So I was just reading and thinking about this, and I don't understand the
>> advice offered under "Read-before-Write" at
>>
>>> "Riak will return an encoded vector clock with every "fetch" or "read"
>>> request that does not result in a "not found" response. In addition to
>>> the Client ID, this vector clock tells Riak how to resolve concurrent
>>> writes, essentially representing the "last seen" version of the object
>>> to which the client made modifications. In order to prevent sibling
>>> explosion, clients should always have a vector clock before sending a
>>> write, and send the vector clock as part of the write request.
>>> Therefore, it is essential that keys are fetched before being written
>>> (except in the case where Riak selects the key or there is *a priori*
>>> knowledge that the key is new). Client libraries that make this
>>> automatic will reduce operational issues by limiting sibling explosion.
>>> Clients may also choose to perform automatic Sibling Resolution on
>>> read."
>>
>> I'm having trouble understanding the advice. I get that if I'm aware of
>> all the siblings, I can resolve them (optionally) with that vector clock.
>> What I don't understand here: If an application PUTs to an object out of
>> the blue, not having read it first, should the client library
>> read-before-write?
>
> Yes it should.
>
>> This seems like a great way to blow away siblings by accident.
>
> But it should never do that. If siblings are encountered, it should *do*
> something.
>
>> Or is the point rather to avoid sibling explosion for applications that
>> don't care about losing information?
>
> A well-behaved client library will not blindly PUT a value "over the top"
> of siblings, but will push the problem to the library user (hopefully in
> some helpful way, like automatically applying some domain-specific
> resolution logic).
>
> So, in the case of the Java client, when you store (or fetch, for that
> matter) you must provide an implementation of the ConflictResolver<T>
> interface to the client; this will then be executed to resolve any siblings
> on the pre-store fetch. If you don't provide a conflict resolver, the Java
> client uses one that throws a runtime exception when it encounters siblings
> on fetch, exactly so that you don't do as you describe and blow away
> potentially meaningful sibling values.
>
> Maybe the wording on the wiki should make this clearer. Maybe it should
> read:
>
> "Clients [that automatically fetch before store] _must_ choose to either
> perform automatic Sibling Resolution *or* abort the write and notify the
> caller of the presence of siblings"
>
> It is a thorny issue. Please let me know if I've answered your question
> adequately.
>
> Cheers
> Russell
>
>> --
>> Andres
> _______________________________________________
> riak-users mailing list
> riak-users at
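
For illustration, the fetch-before-store behaviour described in the quoted
reply can be sketched in Python roughly as follows. This is a toy in-memory
model under stated assumptions, not the Riak API or the Java client's code:
ToyStore, store, refuse_to_resolve and UnresolvedSiblings are hypothetical
names, and a plain integer counter stands in for the vector clock.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


class UnresolvedSiblings(Exception):
    """Raised when siblings are found and no resolver was supplied."""


def refuse_to_resolve(siblings: List[str]) -> str:
    # Default resolver: abort rather than silently drop sibling values.
    raise UnresolvedSiblings(f"{len(siblings)} siblings need resolving")


@dataclass
class ToyStore:
    """In-memory stand-in for a bucket (hypothetical, not a Riak client)."""
    values: Dict[str, List[str]] = field(default_factory=dict)
    versions: Dict[str, int] = field(default_factory=dict)

    def fetch(self, key: str) -> Tuple[List[str], Optional[int]]:
        # Returns all siblings plus a token playing the vector clock's role.
        return list(self.values.get(key, [])), self.versions.get(key)

    def put(self, key: str, value: str, token: Optional[int]) -> None:
        if token == self.versions.get(key):
            # Writer saw the latest version: the write replaces it.
            self.values[key] = [value]
        else:
            # Missing or stale token: keep both values as siblings.
            self.values.setdefault(key, []).append(value)
        self.versions[key] = self.versions.get(key, 0) + 1


def store(db: ToyStore, key: str,
          mutate: Callable[[Optional[str]], str],
          resolver: Callable[[List[str]], str] = refuse_to_resolve) -> None:
    """Fetch first, resolve any siblings, then write with the fetched token."""
    siblings, token = db.fetch(key)
    if len(siblings) > 1:
        current: Optional[str] = resolver(siblings)  # may raise
    else:
        current = siblings[0] if siblings else None
    db.put(key, mutate(current), token)
```

In this sketch, two puts made without a token leave two siblings behind; a
later store() with the default resolver aborts instead of overwriting them,
while store() with a caller-supplied resolver collapses them into one value.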