This sure looks like a bug...?
btilly at gmail.com
Mon Apr 18 21:12:52 EDT 2011
Riak's small_vclock, big_vclock, young_vclock, and old_vclock
parameters already give control over pruning behavior. If there isn't
enough history to compute a common ancestor, then return nothing for
the common ancestor.
The use case here really isn't an SCM. The use case is when two
clients get simultaneous (within, say, 50 ms) requests to write to the
same object. When a third one tries to read the data 5s later, it
would be nice to have a way to figure out what to do. For this use
case you can limit the amount of history quite severely without loss.
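To make the "simultaneous writes" case concrete, here is a rough sketch of how two such writes end up as siblings. It models vector clocks as plain dicts mapping a client id to a counter, which is an assumption for illustration and not Riak's internal representation; the names descends, concurrent, and the client ids are hypothetical.

```python
def descends(a, b):
    """Return True if clock `a` has seen every event recorded in clock `b`."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def concurrent(a, b):
    """Two clocks are concurrent when neither descends from the other."""
    return not descends(a, b) and not descends(b, a)


# Both clients read the same version, then write within a few ms of each other.
base = {"client_x": 1}
write_1 = {**base, "client_a": 1}
write_2 = {**base, "client_b": 1}

is_conflict = concurrent(write_1, write_2)   # neither write saw the other
both_descend = descends(write_1, base) and descends(write_2, base)
```

In this model `base` is exactly the common ancestor being asked for: both writes descend from it, and it is only one step back in history.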
Let's take a practical example of conflicting data structures:

  { "name": "Jane Doe" }

  { "name": "Jane Blow", "husband": "Joe Blow" }

What should it be resolved to? Perhaps Jane just got divorced and
went to work as a secretary. Or she could have gotten married and
left her job. If you give me the common ancestor I can tell which
scenario to believe. Without it I can only guess badly. I don't want
to keep a history here. I want to resolve the discrepancy the next
time I see it (and log it somewhere important if I can't resolve it).
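A minimal sketch of the kind of resolution I mean, assuming the two siblings and the optional ancestor are flat dicts. The function three_way_merge and the MISSING sentinel are made up for this example and are not part of any Riak client. A field changed on only one side takes that side's value; a field changed differently on both sides (or any disagreement when there is no ancestor) is reported as a conflict to log.

```python
MISSING = object()  # sentinel meaning "field absent in this version"


def three_way_merge(ancestor, left, right):
    """Merge two sibling dicts field by field.

    `ancestor` may be None when history was pruned. Returns (merged,
    conflicts), where conflicts lists the fields that could not be resolved.
    """
    merged, conflicts = {}, []
    for key in sorted(set(left) | set(right) | set(ancestor or {})):
        l = left.get(key, MISSING)
        r = right.get(key, MISSING)
        if l == r:
            value = l                      # both siblings agree
        elif ancestor is None:
            conflicts.append(key)          # no ancestor: cannot tell who changed it
            continue
        else:
            base = ancestor.get(key, MISSING)
            if l == base:
                value = r                  # only the right sibling changed it
            elif r == base:
                value = l                  # only the left sibling changed it
            else:
                conflicts.append(key)      # both changed it, differently
                continue
        if value is not MISSING:           # a deleted field stays deleted
            merged[key] = value
    return merged, conflicts


single = {"name": "Jane Doe"}
married = {"name": "Jane Blow", "husband": "Joe Blow"}

got_married = three_way_merge(single, single, married)    # ancestor was single
got_divorced = three_way_merge(married, single, married)  # ancestor was married
no_history = three_way_merge(None, single, married)       # ancestor pruned
```

With the single record as ancestor the merge keeps the married record, with the married record as ancestor it keeps the single one, and with no ancestor both fields come back as conflicts.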
On Mon, Apr 18, 2011 at 5:38 PM, Sean Cribbs <sean at basho.com> wrote:
> Yes, but vector clocks are for resolution of race conditions and network partitions, not to provide an SCM history. Imagine how much space would be consumed by a history long enough to disambiguate an object that has been updated normally 1000 times, followed by one bad client that decides to write to it without fetching the vector clock first.
> Coda Hale put it well in his talk at the recent Riak Meetup: your data needs to be logically monotonic so that writes (and reads) can be retried until resolution is reached.
> Also, we've found that assigning the client id to something that is relevant to your domain, e.g. real people, will help reduce surprises (and degenerate cases like sibling explosion) when it comes to vector-clock resolution.
> Sean Cribbs <sean at basho.com>
> Developer Advocate
> Basho Technologies, Inc.
> On Apr 18, 2011, at 8:15 PM, Aphyr wrote:
>>> I actually had a question about that page. Why is it that when there
>>> is a conflict we can only get the conflicting versions of the data?
>>> If I'm going to try to resolve the conflict intelligently, I really
>>> want the common ancestor as well so that I can try to do a 3-way
>>> merge.
>> Good call. If an ancestor were available it would make counting and merging orthogonal changes *much* simpler.