Problem with Vector Clocks - inconsistencies encountered in cluster with shifted real local clocks

Zuzana Zatrochova zatrochova at
Thu Oct 1 10:45:42 EDT 2015

Thank you for the fast reply,

Could you please clarify what you mean by the context sent by the client? Do
you mean an update to an existing object in the database?

I do see exactly that: when allow_mult=false, only the value with the highest
timestamp is stored.
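
As I understand that rule, it can be sketched in Python roughly as follows. This is only my reading of the behavior, not Riak's code: `descends` and `resolve` are hypothetical helpers, a vector clock is modeled as a dict from actor to counter, and each object is a dict with assumed `vclock`, `timestamp` and `value` fields.

```python
def descends(a, b):
    """Return True if vector clock `a` has seen every event recorded in `b`."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def resolve(local, incoming, allow_mult):
    """Return the list of objects kept after a write reaches the coordinator."""
    # A write that carries the stored object's context supersedes it.
    if descends(incoming["vclock"], local["vclock"]):
        return [incoming]
    # Otherwise (for example, an empty incoming clock) the two are siblings.
    siblings = [local, incoming]
    if allow_mult:
        return siblings
    # Assumption: with siblings disabled, the highest timestamp wins.
    return [max(siblings, key=lambda obj: obj["timestamp"])]


local = {"vclock": {"node_a": 1}, "timestamp": 100, "value": "old"}
no_context = {"vclock": {}, "timestamp": 90, "value": "new"}
with_context = {"vclock": {"node_a": 1}, "timestamp": 90, "value": "new"}
```

With an empty incoming clock, `descends` is false, so `resolve(local, no_context, allow_mult=False)` keeps whichever object has the larger timestamp, regardless of which write actually happened later. With the context attached (`with_context`), the new value replaces the old one even though its timestamp is smaller.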

For me the results are unexpected because the client sees inconsistent
values (not from the last write), even though there are no partitions and the
quorum is set to the strongest consistency configuration. The diagram shows
more clearly how shifted clocks generate the inconsistent result.
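
The effect in the diagram can be reproduced with a small simulation. This is a sketch under assumed numbers (a 400 ms clock offset and two writes 200 ms apart), and `coordinator_put` and `lww_merge` are hypothetical names, not Riak functions.

```python
def coordinator_put(true_time_ms, clock_offset_ms, value):
    """Stamp a context-free write with the coordinator's (possibly skewed) clock."""
    return {"value": value, "timestamp": true_time_ms + clock_offset_ms}


def lww_merge(a, b):
    """Keep the object with the higher timestamp, as when siblings are disabled."""
    return a if a["timestamp"] >= b["timestamp"] else b


# Assumed scenario: node A's clock runs 400 ms ahead, node B's clock is exact.
first = coordinator_put(true_time_ms=1000, clock_offset_ms=400, value="v1")
second = coordinator_put(true_time_ms=1200, clock_offset_ms=0, value="v2")

winner = lww_merge(first, second)
print(winner["value"])  # prints "v1": the later write "v2" is lost
```

Even though "v2" is written 200 ms after "v1", the merge keeps "v1" because its coordinator's clock runs ahead. With both offsets set to zero, the same merge keeps "v2".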


On 1 October 2015 at 14:18, Russell Brown <russell.brown at> wrote:

> I need more time to examine the diagram, but this all looks as expected so
> far.
> If a client sends no context then its write will be a sibling of whatever
> is stored at the coordinator. As you rightly point out, Riak treats an
> incoming clock that is less than a local clock as a sibling.
> If the coordinator is configured to not store siblings then the sibling
> value with the highest timestamp is stored. I recommend you run Riak with
> either allow_mult=true or LWW=true; allow_mult=false, in my view, should
> not be the default.
> If two Riak nodes do the above, and then replicate their values, the
> single value with the highest timestamp is stored. Isn’t this what you are
> seeing? If you depend on time to pick the latest, and nodes’ clocks are out
> of sync, this is the price.
> Is this what you are seeing? Are you seeing results you didn’t expect, or
> non-deterministic results? Or both?
> Regards
> Russell
> On 1 Oct 2015, at 12:58, Zuzana Zatrochova <zatrochova at> wrote:
> > Hi,
> >
> >
> >
> > We are researching the client-centric consistency features of the Riak
> database. We encountered a problem with the vector clock implementation:
> vector clocks do not seem to work locally on a machine as expected. We
> would like you to confirm whether this behavior is intended. First I will
> describe the environment of our experiments and then present the problem.
> >
> >
> >
> > Environment:
> >
> >
> >       • Our environment consists of six virtual machines
> >               • five machines in the Riak cluster, each representing a
> single Riak node with the Riak database
> >               • one machine with a Java application that simulates
> multiple clients communicating with the Riak database
> >       • The machines are VMs virtualized with VMware, and their clocks are
> slightly shifted relative to each other (by no more than 1 second)
> >       • We ran experiments with versions riak-1.4.8 and riak-2.1.1. In
> riak-1.4.8, app.config contains vnode_vclocks = true (the default setting
> when downloaded). In riak-2.1.1 we could not locate a setting for vnode
> vclocks in riak.conf or in the advanced configuration documentation,
> so we assumed it also defaults to true and can no longer be changed
> >       • For each experiment we have 500 clients concurrently sending
> requests to a random node from the cluster. There are 20000 requests per
> minute operating on only 20 different keys (the load on a single key is
> about 16 requests per second, with a read:write ratio of 50:50).
> >       • For the referenced issue we used quorums R = 1, W = 3; R = 2, W = 2;
> and R = 3, W = 1
> >       • All Riak settings are default apart from the IP settings and quorum
> settings. We added interceptors from the riak_test module that don’t change
> the code and are implemented only for logging purposes (information about
> states of nodes). error.log is empty
> >
> > Problem:
> >
> >
> >       • It seems that Riak does not use vector clocks locally, only on
> a global scale. When a data object is created on the client side and sent to
> the Riak database, it does not have any vector clock assigned. More
> precisely, riak_object:vclock(UpdObj) returns [] while
> riak_object:vclock(LocalObj) returns the local VC for the local object.
> Therefore the function (in 2.1.1, but the behavior is similar in 1.4.8)
> vclock:descends(NewObject, LocalObject) returns false in all my
> experiments with different quorums (an empty vector clock cannot descend a
> non-empty vector clock). This behavior leads to a merge of contents, i.e. the
> creation of siblings (or, when siblings are not allowed, as in our
> configuration, resolving the value by timestamp rather than by vector clock)
> >       • In our experiments, when the time on the VMs is out of sync by up to
> 500 milliseconds, the situation from the attached picture issue.png
> arises. Because two objects with the same key are sent to two different
> coordinators whose clocks are shifted, the later object is assigned an
> earlier timestamp than the object that was sent before it. As a result of
> the vector clock implementation in Riak, the later object is lost in the
> merge of contents, where the later timestamp (wrong because of the local
> clock shift) is evaluated as the latest.
> >
> > The question:
> >
> >
> >
> > Is this the intended behavior of Riak? The problem is that even when the
> quorum is set to prefer consistency and there are no partitions in the
> cluster, clients still see inconsistent results. By consistent we mean that
> any read must return the value of the latest finished write or of a later
> unfinished write request. (We did not use the strong_consistency feature of
> riak-2.1.1.)
> >
> >
> >
> > Thank you,
> >
> > Zuzana
> >
> > <issue.png>