build 393: adding server results in missing data

Justin Sheehy justin at basho.com
Tue Nov 17 16:22:33 EST 2009


Riak users,

A problem was discovered in the mechanism that was used to provide
"filtered preflists" of virtual nodes for storage.  The goal of this
mechanism was to ensure that as many replicas of each document as
possible are placed on separate physical nodes.  The dynamic nature of
the filtering turned out to cause problems in certain cases,
especially when adding nodes to small clusters while using large
R-values.
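
To make the idea concrete, here is a toy sketch in Python (purely
illustrative; it is not Riak's Erlang implementation, and the
8-partition ring, node names, and helper functions below are
invented) of how a request-time "filtered preflist" can differ from
the raw preference list when only a few physical nodes own the ring:

    import hashlib

    NUM_PARTITIONS = 8
    N = 3  # desired number of replicas per key

    def key_to_partition(bucket_key: bytes) -> int:
        """Hash a key onto the ring; return its home partition index."""
        digest = hashlib.sha1(bucket_key).digest()
        return int.from_bytes(digest, "big") % NUM_PARTITIONS

    def raw_preflist(owners, start):
        """The N partitions clockwise from the key's home partition."""
        return [(p % NUM_PARTITIONS, owners[p % NUM_PARTITIONS])
                for p in range(start, start + N)]

    def filtered_preflist(owners, start):
        """Walk the ring at request time, skipping partitions whose
        physical node already holds an earlier replica.  With few
        nodes, fewer than N distinct owners may be reachable at all."""
        seen, result, p = set(), [], start
        while len(result) < N and p < start + NUM_PARTITIONS:
            owner = owners[p % NUM_PARTITIONS]
            if owner not in seen:
                seen.add(owner)
                result.append((p % NUM_PARTITIONS, owner))
            p += 1
        return result

    # Two physical nodes owning blocks of adjacent partitions: the raw
    # and filtered preference lists disagree, and the filtered one can
    # only ever reach two distinct owners even though N = 3.
    owners = ["node_a", "node_a", "node_b", "node_b",
              "node_a", "node_a", "node_b", "node_b"]
    home = key_to_partition(b"bucket/key")
    print("raw:     ", raw_preflist(owners, home))
    print("filtered:", filtered_preflist(owners, home))

With only two physical nodes the filter can never reach three
distinct owners, which hints at the kind of edge case the dynamic
approach had to handle.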

The problem had previously gone undetected because almost all
production and test deployments fell into the category where the
cluster was rarely, if ever, grown by more than X nodes at a time,
where X = N - R.  For example, with N = 3 replicas and R = 3, X = 0,
so adding even a single node could expose the problem.

Upon examination we determined that the best and safest path was to
remove the dynamic nature of filtered preflists and instead ensure
that the partition ownership within the ring maintained the maximal
spread innately, via the static position of owning nodes.  This means
that more work is done when nodes join or leave the ring,
but much less work is done in the much more frequent case of issuing a
request that must reach multiple replicas.
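
As a rough illustration of the new invariant (again a simplified
Python sketch with invented names, not the actual claim algorithm),
one can check at join time that any N consecutive partitions are
owned by N distinct nodes whenever enough nodes exist; when that
property holds, the preference list for a key is simply the next N
partitions on the ring, with no per-request filtering:

    NUM_PARTITIONS = 8
    N = 3  # replicas per key

    def claim(nodes):
        """Stripe partition ownership across the member nodes.  A real
        claim algorithm must also handle wrap-around and node counts
        that do not divide the ring evenly; this sketch ignores those
        complications."""
        return [nodes[i % len(nodes)] for i in range(NUM_PARTITIONS)]

    def spread_ok(owners):
        """True if every window of N consecutive partitions (wrapping
        around the ring) is owned by N distinct physical nodes, so a
        preference list never needs request-time filtering."""
        return all(
            len({owners[(start + i) % NUM_PARTITIONS]
                 for i in range(N)}) == N
            for start in range(NUM_PARTITIONS)
        )

    print(spread_ok(claim(["node_a", "node_b",
                           "node_c", "node_d"])))      # True
    print(spread_ok(claim(["node_a", "node_b"])))      # False: 2 owners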

This should improve Riak behavior and performance for all users, as
well as remove the known bug, but it creates an unfortunate backward
incompatibility with old clusters.  We are working now with the known
users of Riak in production to cleanly update their systems without
any application-visible downtime.

If you want the easiest path to fix an existing cluster and can
tolerate a short amount of downtime, we recommend the following method:

1. On your old cluster, run:
   ./start-backup.sh <node-in-old-cluster> <old-cluster-cookie> <filename>
2. Set up a brand new cluster with riak-0.6.2 or later.
3. On your new cluster, run:
   ./start-restore.sh <node-in-new-cluster> <new-cluster-cookie> <filename>

There are other means to switch to the new code that require less
interruption, but the above should be the simplest for the typical user.

We apologize for any inconvenience this change may have caused.

- the Riak core team
