This sure looks like a bug...?

Dan Reverri dan at basho.com
Mon Apr 18 19:59:00 EDT 2011


Hi Bryan,

This is an excellent question and one of the more difficult areas of Riak to
understand. The source of confusion in this situation is vector clocks. Riak
maintains a vector clock for every object which is used to track different
versions of the object and potentially auto-repair out-of-sync data.

I recommend reviewing this page in the wiki:
https://wiki.basho.com/Vector-Clocks.html

I've tried to provide a walkthrough below that explains the behavior. The
main lesson to take away is that you should always provide a client ID and,
when updating an existing object, the vector clock from your last read.

When Riak processes an update for an object, it compares the vector clock of
the update to the vector clock of the existing object. Based on that
comparison, Riak can determine whether the update is a descendant of the
existing value, a sibling, or a stale value.
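
To make the three outcomes concrete, here is a minimal sketch of the
comparison in Python. It is a toy model only, not Riak's implementation: the
names descends and classify are made up for illustration, and a vector clock
is modeled as a plain dict of client ID to counter.

```python
def descends(a, b):
    """True if clock `a` has seen every update recorded in clock `b`."""
    return all(a.get(client, 0) >= count for client, count in b.items())


def classify(update, existing):
    """Classify an incoming clock relative to the stored clock.

    Assumption of this sketch: identical clocks are reported as
    "descendant"; a real system may treat that case differently.
    """
    if descends(update, existing):
        return "descendant"  # update supersedes the existing value
    if descends(existing, update):
        return "stale"       # existing value has already seen this update
    return "sibling"         # neither clock descends from the other


print(classify({"Client1": 2}, {"Client1": 1}))  # descendant
print(classify({"Client2": 1}, {"Client1": 2}))  # sibling
print(classify({"Client1": 1}, {"Client1": 2}))  # stale
```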

For example, if Client1 writes the first version of Object1 the vector clock
is:
[{Client1, 1}]

If Client1 updates the value of Object1 and supplies this vector clock with
the update, the resulting vector clock of the update becomes:
[{Client1, 2}]

This vector clock is a descendant of the first vector clock so Riak chooses
the update as the winning value.

If Client2 writes a value to Object1 with no vector clock the vector clock
on the write becomes:
[{Client2, 1}]

Comparing this vector clock to "[{Client1, 2}]" shows that neither clock
descends from the other, so the new value is a sibling of the current value.
Object1 now has two siblings.

If Client1 writes a value to Object1 with no vector clock the vector clock
on the write becomes:
[{Client1, 1}]

This vector clock is considered stale, since the existing vector clock
already records two updates from Client1. The write is treated as an old
value and the existing value is kept. This is the behavior you are seeing.
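
The whole walkthrough can be replayed with a small toy model. This is a
sketch under stated assumptions, not how Riak is implemented: ToyObject, put,
and merge are hypothetical names, the counter for the writing client is
simply incremented on each put, and a stale write is dropped.

```python
def descends(a, b):
    """True if clock `a` has seen every update recorded in clock `b`."""
    return all(a.get(c, 0) >= n for c, n in b.items())


def merge(a, b):
    """Combine two clocks by taking the highest counter per client."""
    return {c: max(a.get(c, 0), b.get(c, 0)) for c in a.keys() | b.keys()}


class ToyObject:
    """Hypothetical stand-in for one stored object and its vector clock."""

    def __init__(self):
        self.clock = {}
        self.values = []

    def put(self, client_id, value, supplied_clock=None):
        # Start from the clock the client supplied (empty if none),
        # then record one more update from this client.
        clock = dict(supplied_clock or {})
        clock[client_id] = clock.get(client_id, 0) + 1
        if descends(clock, self.clock):
            self.clock, self.values = clock, [value]
            return "descendant"
        if descends(self.clock, clock):
            return "stale"  # assumption: the stale write is dropped
        self.clock = merge(self.clock, clock)
        self.values.append(value)
        return "sibling"


obj = ToyObject()
r1 = obj.put("Client1", "v1")                  # clock [{Client1, 1}]
r2 = obj.put("Client1", "v2", {"Client1": 1})  # clock [{Client1, 2}]
r3 = obj.put("Client2", "v3")                  # no clock: [{Client2, 1}]
r4 = obj.put("Client1", "v4")                  # no clock: [{Client1, 1}]
print(r1, r2, r3, r4)  # descendant descendant sibling stale
print(obj.values)      # ['v2', 'v3']
```

In this model, a write with no vector clock from a client ID the object has
never seen (say "Client3") comes out as a sibling rather than stale, which is
consistent with the problem going away when the client ID is changed on each
recycled connection.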

Thanks,
Dan

Daniel Reverri
Developer Advocate
Basho Technologies, Inc.
dan at basho.com


On Mon, Apr 18, 2011 at 4:16 PM, Bryan O'Sullivan <bos at mailrank.com> wrote:

> I have an app using the protobuf API that's exhibiting some strange
> behaviour.
>
> I have a bucket with allow_mult = true, so when I perform a PUT I expect
> that if a matching key exists, a sibling will be created, and the response
> to the PUT should contain both the pre-existing value and the new value,
> with different vtags. However, what I'm actually seeing is that the older
> value is returned, and the PUT appears not to have succeeded at all.
> However, this only happens under heavy load. If I poke the server just once
> in a while, everything seems to work fine.
>
> This problem arises in a server-side application that uses connection
> pooling, so when it's under heavy load, requests are always issued with the
> same client ID. On a hint from Kyle Kingsbury, I tried changing the client
> ID every time an existing connection is recycled out of the pool, and that
> seems to have made the problem disappear. However, I find this pretty
> disturbing. What on earth is going on?