Problem with Vector Clocks - inconsistencies encountered in cluster with shifted real local clocks

Zuzana Zatrochova zatrochova at
Thu Oct 1 07:58:15 EDT 2015


We are researching the client-centric consistency features of the Riak
database and have run into a problem with its vector clock implementation:
vector clocks do not seem to work locally on a machine as we expected. We
would like you to confirm whether this behavior is intended. I will first
describe our experimental environment and then present the problem.


   - Our environment consists of six virtual machines:
      - five machines form the Riak cluster; each one is a single Riak node
      with the Riak database
      - one machine runs a Java application that simulates multiple clients
      communicating with the Riak database
   - The machines are VMware virtual machines whose clocks are slightly
   shifted relative to each other (by no more than 1 second)
   - We ran the experiments with riak-1.4.8 and riak-2.1.1. In riak-1.4.8,
   app.config contains vnode_vclocks = true (the default setting that was
   there when we downloaded it). In riak-2.1.1 we could not find a setting
   for vnode vclocks in riak.conf or in the advanced configuration section
   of the documentation, so we assumed it also defaults to true and can no
   longer be changed
   - In each experiment, 500 clients concurrently send requests to random
   nodes in the cluster. There are 20000 requests per minute operating on
   only 20 different keys (the load on a single key is about 16 requests per
   second; read:write ratio = 50:50)
   - For the issue described here we used the quorums R = 1, W = 3; R = 2,
   W = 2; and R = 3, W = 1
   - All Riak settings are default apart from the IP and quorum settings.
   We added interceptors from the riak_test module; they don’t change the
   code and are used only for logging (information about the states of the
   nodes). error.log is empty
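
For readers who want to check the figures above, here is a small Python sketch of the arithmetic. It is only an illustration: the replication factor N_VAL = 3 is an assumption (Riak's default n_val; the setup above does not state it), and all variable names are made up for this example.

```python
# Back-of-the-envelope check of the workload and quorum figures.
# ASSUMPTION: N_VAL = 3 (Riak's default n_val); the post does not state it.
N_VAL = 3

requests_per_minute = 20_000
distinct_keys = 20

# Load on a single key in requests per second (about 16.7).
per_key_per_second = requests_per_minute / 60 / distinct_keys

# (R, W) pairs used in the experiments.
quorums = [(1, 3), (2, 2), (3, 1)]

# A read quorum and a write quorum share at least one replica when R + W > N.
overlaps = {(r, w): r + w > N_VAL for r, w in quorums}

print(round(per_key_per_second, 1))
for (r, w), ok in overlaps.items():
    print(f"R={r} W={w} overlaps: {ok}")
```

Under that assumption every tested pair satisfies R + W > N, which is what we mean below by a quorum that prefers consistency.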


   - It seems that Riak does not use vector clocks locally, only on a global
   scale. When a data object is created on the client side and sent to the
   Riak database, it has no vector clock assigned. More precisely,
   riak_object:vclock(UpdObj) returns [], while
   riak_object:vclock(LocalObj) returns the local vector clock of the
   stored object. Therefore the call vclock:descends(NewObject,
   LocalObject) (in 2.1.1; 1.4.8 behaves similarly) returned false in all
   my experiments with the different quorums, because an empty vector clock
   cannot descend a non-empty one. This leads to a merge of contents, that
   is, the creation of siblings (or, when siblings are not allowed, as in
   our configuration, to resolving the value by timestamp rather than by
   vector clock)
   - In our experiments, when the time on the VMs is out of sync by up to
   500 milliseconds, the situation shown in the attached picture issue.png
   arises. Two objects with the same key are sent to two different
   coordinators whose clocks are shifted, so the later object is assigned
   an earlier timestamp than the object that was sent before it. As a
   result of the vector clock implementation in Riak, the later object is
   lost in the merge of contents, where the other object's timestamp (later
   only because of the local clock shift) is evaluated as the latest.
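
The two observations above can be reproduced in a few lines of Python. This is a minimal model and not Riak's Erlang code: a vector clock is represented as a dict, descends() only mimics what we understand vclock:descends to check, resolve_by_timestamp() is a stand-in for timestamp-based resolution, and the actor name and millisecond values are hypothetical.

```python
# Illustrative model only; this is not Riak's implementation.
# A vector clock is modelled as a dict {actor: counter}.

def descends(a, b):
    """True if clock `a` has seen every event recorded in clock `b`."""
    return all(a.get(actor, 0) >= counter for actor, counter in b.items())

def resolve_by_timestamp(x, y):
    """Stand-in for timestamp resolution: keep the larger timestamp."""
    return x if x["ts_ms"] >= y["ts_ms"] else y

# A write sent without a previously fetched clock carries an empty one.
upd_vclock = {}
local_vclock = {"node_a": 1}  # hypothetical actor name
new_descends_local = descends(upd_vclock, local_vclock)

# Hypothetical timeline in "true" milliseconds: the second write is sent
# 200 ms after the first, but its coordinator's clock runs 500 ms behind.
first = {"value": "first", "ts_ms": 1000}
second = {"value": "second", "ts_ms": 1200 - 500}
winner = resolve_by_timestamp(first, second)

print(new_descends_local)  # False: an empty clock cannot descend a non-empty one
print(winner["value"])     # first: the later write is discarded
```

In this model, a write that carried the stored clock (for example, one obtained by reading the object first) would descend it, and the timestamp comparison would not be needed for that pair.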

The question:

Is this the intended behavior of Riak? The problem is that even when the
quorum is set to prefer consistency and there are no partitions in the
cluster, clients still observe inconsistent requests. By consistent we mean
that any read must return the value of the latest finished write or of a
later unfinished write request. (We did not use the strong_consistency
feature of riak-2.1.1.)
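
To make the consistency condition above precise, here is one possible reading of it as a Python sketch. The history format (start, end, value), the function name, and the sample values are all hypothetical, and times are taken from a single clock on the client machine.

```python
# One interpretation of the rule, not a definitive checker.
# Each write is (start_ms, end_ms, value); end_ms is None while the write
# is still in flight. All times come from one clock (the client machine).

def allowed_values(writes, read_start_ms):
    """Values that a read starting at read_start_ms may return."""
    finished = [w for w in writes
                if w[1] is not None and w[1] <= read_start_ms]
    in_flight = [w for w in writes
                 if w[0] <= read_start_ms
                 and (w[1] is None or w[1] > read_start_ms)]
    allowed = {w[2] for w in in_flight}
    if finished:
        # The most recently finished write is always an acceptable answer.
        allowed.add(max(finished, key=lambda w: w[1])[2])
    return allowed

# Hypothetical history for a single key.
writes = [(0, 10, "v1"), (20, 30, "v2"), (35, None, "v3")]
ok_values = allowed_values(writes, read_start_ms=40)

print(sorted(ok_values))   # ['v2', 'v3']
print("v1" in ok_values)   # False: returning v1 would violate the rule
```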

Thank you,

-------------- next part --------------
A non-text attachment was scrubbed...
Name: issue.png
Type: image/png
Size: 347569 bytes
Desc: not available
URL: <>

More information about the riak-users mailing list