safely resolving conflicts on read

Justin Karneges justin at affinix.com
Wed Nov 2 13:40:44 EDT 2011


Thanks everyone for these replies (and also Aphyr, off-list).  They have helped
me confirm my suspicions, and it sounds like I'm on the right track.

For one of my keys, I am doing a sort of manual "last write wins" by having
the reader sort siblings by timestamp, then by vtag, to deterministically
select the same sibling every time.  The reason for keeping the other siblings
around is that they may contain the only references to other keys created
along with them.  A separate cleanup process can then be sure to delete the
referred keys before removing the siblings.  And of course the algorithm used
to determine the winning sibling is shared by both the read function and the
cleanup function.
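
Roughly, the shared selection step looks like the sketch below.  This is only
an illustration, not real client code: Sibling, pick_winner and losers are
made-up names, the fields stand in for whatever the Riak client exposes as
last-modified time and vtag, and I'm assuming the newest timestamp wins with
the highest vtag breaking ties.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Sibling:
    # Hypothetical stand-in for one sibling as returned by a client library.
    timestamp: float  # last-modified time of this sibling
    vtag: str         # per-sibling tag, used only as a tie-breaker
    value: bytes


def pick_winner(siblings: List[Sibling]) -> Sibling:
    """Pick the same sibling no matter what order the siblings arrive in.

    Assumption: newest timestamp wins; equal timestamps fall back to the
    lexicographically largest vtag.
    """
    if not siblings:
        raise ValueError("no siblings to choose from")
    return max(siblings, key=lambda s: (s.timestamp, s.vtag))


def losers(siblings: List[Sibling]) -> List[Sibling]:
    """Siblings the cleanup process may remove once the keys they refer to
    have been deleted.  Uses the same pick_winner as the read path."""
    winner = pick_winner(siblings)
    return [s for s in siblings if s is not winner]
```

Since the result depends only on the (timestamp, vtag) pairs, every reader and
the cleanup process agree on the winner, and resolving a value against itself
gives back the same value, which is the f(X,X) = X property mentioned below.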

On Wednesday, November 02, 2011 07:52:03 AM Bob Ippolito wrote:
> An approach similar to #1 is implemented in statebox
> http://github.com/mochi/statebox - basically the trick is to store an
> operation queue along with the data, and to put some constraints on how
> operations must work so that they can be repeated for conflict resolution.
> 
> On Wednesday, November 2, 2011, Erik Søe Sørensen <ess at trifork.com> wrote:
> > What you'd usually do is somewhere between 2) and 3) - namely, accept
> > that siblings might occur (although rarely). Also, you'd have a resolution
> > function with the property (besides being deterministic) that
> > reconciling two identical siblings would yield the same - i.e., f(X,X) = X.
> 
> > ________________________________________
> > From: riak-users-bounces at lists.basho.com
> > [riak-users-bounces at lists.basho.com] on behalf of Justin Karneges
> > [justin at affinix.com]
> > Sent: 1 November 2011 21:34
> > To: riak-users at lists.basho.com
> > Subject: safely resolving conflicts on read
> > 
> > Hi,
> > 
> > http://wiki.basho.com/Vector-Clocks.html contains this text:
> > 
> > "It should be noted that if you are trying to resolve conflicts
> > automatically, you can end up in a condition with which two clients are
> > simultaneously resolving and creating new conflicts."
> > 
> > If conflict resolution is moved to the reader, I'm curious what
> > strategies people use to avoid this kind of feet stomping.
> > 
> > Some ideas that have come to mind:
> > 
> > 1) Have a deterministic way of deriving a correct/unified/winner value
> > from multiple siblings, such that any reading client would always arrive
> > at the same answer.  Simply use this derived answer as the value read, but
> > don't attempt to write a corrected value into Riak.  The value could
> > eventually be corrected at the time of a necessary write as opposed to a
> > read reaction.
> > 
> > 2) Determine a correct value per #1 above, but then attempt to write this
> > value back into Riak in such a way that if multiple nodes were to write
> > the same value simultaneously then they don't create siblings.  I'm not
> > sure if this is possible in Riak?
> > 
> > 3) Determine a correct value per #1 above, and allow exactly one node to
> > ever immediately write corrected values after a read.  Something like "if
> > hash(key) % node_count == current_node then do_correction".  Since value
> > correction is not vital to availability, there's no harm if the owning
> > node is down.  I just figure this would allow for self-healing over time.
> > 
> > Maybe there are other ways.  What do people really do?
> > 
> > Thanks,
> > Justin
> > 
> > _______________________________________________
> > riak-users mailing list
> > riak-users at lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com



