Possibility of a CAS API

Martin Bruse zondolfin at gmail.com
Sat Feb 25 02:43:17 EST 2012


Take a look at https://github.com/ProjectDaisy/blodsband#L480 where I have
implemented a CAS-like method on top of Riak. It works by (slightly
simplified):

0) Reading current value
  1) Checking current value
  2) Writing with a "request id", W = all and readbody = true if the
current value is what we expected (vclock is right, or value didn't exist)
  3) OR overwriting siblings with the one with highest "request id" or
"winner" if multiple siblings
  4) OR overwriting value with "winner" if it was the one we wrote and has
no siblings

and looping around that until we have an entry marked "winner".

I have tested it extensively on single node systems, and it seems to work.
What I would like is someone reviewing the algorithm and possibly trying it
on their multi node system :)

On Feb 25, 2012 4:21 AM, "Armon Dadgar" <armon.dadgar at gmail.com> wrote:
>
> It sounds like the "If-Unmodified-Since" and "If-None-Match" flags could
do what I
> need, but the docs specify "it is possible for the condition to evaluate
to true for
> multiple requests if the requests occur at the same time."
>
> From my understanding, the KV vnode's process their requests in a serial
fashion.
> I'm not sure I fully understand how It could be that the request
evaluates to true
> for multiple requests, if the PUTs are handled serially.
>
> If it is a matter of the vnodes being interleaved, would it be solvable by
> setting w = r = n?
>
> I'm not convinced that a CAS operation is inevitably subject to data
races.
> There are proven techniques for avoiding races at the cost of latency,
> which is acceptable in certain situations.
>
> I will take a look at Zab, thanks for the reference!
>
> Best Regards,
>
> Armon Dadgar
>
> On Feb 24, 2012, at 6:09 PM, Dietrich Featherston wrote:
>
>> If you need CAS semantics, then coordinate that outside of riak. Any
check-then-act type of operation where atomicity is important is going to
leave some room for a data race in a system with the distribution semantics
of riak. Would suggest thinking about the problem in such a way that
handling of siblings is tolerant of duplicate writes and eventually the
correct value bubbles up to the readers. That or do the coordination of
unique indexes in something not dynamo shaped.
>>
>> I can't say I'm intimately familiar with the work yet, but others have
prototyped/postulated consistency layers on top of riak (a la zab) that
might more closely match what you're trying to do. None of this is in a
released / supported version of riak to my knowledge though.
>>
>> Thanks,
>> D
>>
>>
>> On Fri, Feb 24, 2012 at 4:41 PM, Armon Dadgar <armon.dadgar at gmail.com>
wrote:
>>>
>>> As part of a new feature we are working on, we've run into
>>> a situation where it would be incredibly convenient to have a
>>> check-and-set (CAS) API for Riak KV. In short, we are trying to build
>>> a unique index of a bucket, using a second bucket which acts as a
>>> reverse index.
>>>
>>> The CAS API would operate in the same manner as a PUT, except it
>>> should take a "last vclock". The new value + last vclock are submitted
>>> to the responsible vnodes. The vnodes respond if the last vclock
>>> for the key matches the specified last value. If we get "r" nodes
responding
>>> that the last value matches, then we should commit the write. This
method
>>> is basically a two-phase commit.
>>>
>>> It would also be great if no-value sentinel could be specified to
indicate
>>> the CAS should only succeed if there is not already a key. We need this
>>> to make sure uniqueness constraints are not violated.
>>>
>>> I wanted to gauge the interest from the community in something like
this,
>>> and see if I could get thoughts from the Basho team on if this could be
>>> implemented.
>>>
>>> Best Regards,
>>>
>>> Armon Dadgar
>>>
>>>
>>> _______________________________________________
>>> riak-users mailing list
>>> riak-users at lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>
>
>
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.basho.com/pipermail/riak-users_lists.basho.com/attachments/20120225/1703e4f7/attachment.html>


More information about the riak-users mailing list