Why does sending X-Riak-ClientId prevent subsequent updates?

Rusty Klophaus rusty at basho.com
Tue Feb 9 11:04:12 EST 2010


Hi Jay,

After much animated discussion with the rest of the engineering team, I am
forced to concede that what you and I /thought/ was a bug is actually the
correct behavior.

To quote Justin:

"In effect, the client keeps saying 'here is the very first version of this
object'.  The first time that happens, the object is simply stored.  The
second time, Riak detects that a sibling is being created due to conflicting
'first versions' and resolves them, storing what could be considered a
'second version' of the object as a result of the merge.  Subsequent times,
[the client continues to send the 'first version' of the object,
which] logically precedes the already-stored 'second version' and so the
incoming object is discarded."
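
To make Justin's description concrete, here is a toy model of the three
stores. It is only a sketch under simplifying assumptions, not Riak's actual
code: descends and toy_put are hypothetical names, a vector clock is a plain
dict mapping client id to counter, and timestamps are left out.

```python
def descends(a, b):
    """True if clock `a` has seen every event recorded in clock `b`."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def toy_put(stored, value, client_id, vclock=None):
    """Toy store (hypothetical). `stored` is None or a (vclock, values) tuple."""
    # No vclock from the client means "this is the very first version".
    incoming = dict(vclock or {})
    incoming[client_id] = incoming.get(client_id, 0) + 1
    if stored is None:
        return (incoming, [value])
    current, values = stored
    if incoming != current and descends(incoming, current):
        return (incoming, [value])   # incoming is newer: replace
    if incoming != current and descends(current, incoming):
        return stored                # incoming is older: discard
    # Equal or concurrent clocks: keep both values as siblings and
    # advance the clock, producing the "second version".
    merged = {a: max(current.get(a, 0), incoming.get(a, 0))
              for a in set(current) | set(incoming)}
    merged[client_id] = merged.get(client_id, 0) + 1
    return (merged, values + [value])


state = None
for n in (0, 1, 2):
    # Same client id every time, never sending a vclock back.
    state = toy_put(state, {'foo': n}, 'client-1')

clock, values = state
# clock is {'client-1': 2}; values holds foo=0 and foo=1; foo=2 was discarded
```

In this model the third store is dropped because its clock {'client-1': 1}
precedes the stored {'client-1': 2}, which lines up with the foo=1 result in
Jay's doctest. Passing the stored clock back, e.g.
toy_put(state, {'foo': 2}, 'client-1', vclock=clock), makes the incoming
clock a strict successor, so the write is accepted.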

This also explains why removing the client header "fixed" the problem.
Without a client header, Riak generated a new client-id for each store
operation. As a result, each store looked like a new sibling. And because
allow_mult is set to false by default, Riak automatically selected the
sibling with the latest timestamp.
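
A rough sketch of that case, again as a toy model rather than Riak's
implementation: toy_put_anonymous and last_write_wins are hypothetical names,
the generated ids are made up, and the timestamps are just loop counters.

```python
import itertools

_fresh_ids = itertools.count(1)  # stand-in for Riak generating a client id


def toy_put_anonymous(siblings, value, timestamp):
    """Toy store for a request that carries no client id and no vclock."""
    clock = {'generated-%d' % next(_fresh_ids): 1}
    # A never-seen actor means this clock neither descends from nor
    # precedes any stored clock, so the value is kept as a sibling.
    return siblings + [(clock, timestamp, value)]


def last_write_wins(siblings):
    """Pick the sibling with the latest timestamp (the allow_mult=false case)."""
    return max(siblings, key=lambda s: s[1])[2]


siblings = []
for timestamp, n in enumerate((0, 1, 2)):
    siblings = toy_put_anonymous(siblings, {'foo': n}, timestamp)

winner = last_write_wins(siblings)  # the foo=2 store has the latest timestamp
```

Every store survives as a sibling here, and the most recent one wins, which
is why the doctest passed once the client header was removed.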

(For more information on why Riak considers a store the first version, and
what is meant by "sibling", read Bryan's post on vector clocks:
http://blog.basho.com/2010/01/29/why-vector-clocks-are-easy)

So Riak is doing the right thing, even though--in this case--it is
admittedly a bit confusing at first glance.

With that in mind, it's best to stick to re-using an object across updates,
as I mentioned in a previous email, e.g.:

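import riak
client = riak.Riak('127.0.0.1', 8098)
obj = client.store('b', 'k', {'foo':0})
# Re-use obj: store() sends obj['__riak_vclock__'] as the X-Riak-Vclock
# header when it is present, so Riak sees an update, not a new object.
obj["object"]["foo"] = 1
client.store('b', 'k', obj)
obj["object"]["foo"] = 2
client.store('b', 'k', obj)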

Best,
Rusty

On Sun, Feb 7, 2010 at 9:15 AM, Rusty Klophaus <rusty at basho.com> wrote:

> Hi Jay,
>
>
>> That's because there's a bug in the header handling when using pycurl
>> which is causing my __riak_vclock__ to not get set.  I've attached a patch,
>> which also incidentally moves the response.reset() into the _pycurl_request
>> method, for improved encapsulation.
>>
>
> Good catch, and thank you for the patch!
>
>
>>
>>  In the end though, I still see an object with foo=1 rather than foo=2. I
>>> imagine what we're both seeing is that the foo=1 update is still running
>>> when the foo=2 update starts. The default quorum settings in the Python
>>> client _should_ prevent this, so I'm going to continue investigating, and
>>> I'll let you know what I find.
>>>
>>
>> Great!
>
>
> I found the root cause of this problem. It's a small fix, but it has the
> potential to affect other parts of the system, so I'm going to run it by the
> rest of the team before pushing out any changes. I'll follow up again in the
> next day or two.
>
> Best,
> Rusty
>
>
>>
>>
>>> On Sat, Feb 6, 2010 at 6:48 PM, Jay Doane <jay.s.doane at gmail.com> wrote:
>>> I'm using the new (raw) riak.py client library, and testing multiple
>>> updates to the same object with the following doctest file (also attached):
>>>
>>> $ cat riak_test.py
>>>
>>> """
>>> >>> import riak
>>> >>> client = riak.Riak('127.0.0.1', 8098)
>>> >>> client.delete('b', 'k')
>>> >>> client.store('b', 'k', {'foo':0})
>>> {'object': {u'foo': 0}}
>>> >>> client.store('b', 'k', {'foo':1})
>>> {'object': {u'foo': 1}}
>>> >>> client.store('b', 'k', {'foo':2})
>>> {'object': {u'foo': 2}}
>>> """
>>>
>>> import doctest
>>> doctest.testmod()
>>>
>>>
>>> When run against the 0.8 release, the final store operation always fails:
>>>
>>> $ python riak_test.py
>>> **********************************************************************
>>> File "riak_test.py", line 9, in __main__
>>> Failed example:
>>>   client.store('b', 'k', {'foo':2})
>>> Expected:
>>>   {'object': {u'foo': 2}}
>>> Got:
>>>   {'object': {u'foo': 1}}
>>> **********************************************************************
>>> 1 items had failures:
>>>  1 of   6 in __main__
>>> ***Test Failed*** 1 failures.
>>>
>>>
>>> Usually, that's all there is to the failure, but sometimes,
>>> unpredictably, I see the following on the riak console:
>>>
>>> =ERROR REPORT==== 6-Feb-2010::14:53:48 ===
>>> webmachine error: path="/raw/b/k"
>>> {error,
>>>   {error,function_clause,
>>>       [{raw_http_resource,select_doc,
>>>            [{ctx,<<"b">>,<<"k">>,
>>>                 {riak_client,'riak at 127.0.0.1',<<1,11,16,235>>},
>>>                 2,2,2,2,"raw",local,
>>>                 {error,notfound},
>>>                 undefined,undefined,[]}]},
>>>        {raw_http_resource,produce_doc_body,2},
>>>        {raw_http_resource,accept_doc_body,2},
>>>        {webmachine_resource,resource_call,3},
>>>        {webmachine_resource,do,3},
>>>        {webmachine_decision_core,resource_call,1},
>>>        {webmachine_decision_core,accept_helper,0},
>>>        {webmachine_decision_core,decision,1}]}}
>>>
>>>
>>> However, if I alter riak.py so that it doesn't send the X-Riak-ClientId
>>> header, the tests all pass.  Here's the trivial diff (also attached):
>>>
>>> diff -r c4486329e4af client_lib/riak.py
>>> --- a/client_lib/riak.py        Wed Feb 03 15:51:26 2010 -0500
>>> +++ b/client_lib/riak.py        Sat Feb 06 15:31:36 2010 -0800
>>> @@ -135,8 +135,7 @@
>>>
>>>    @expect(200)
>>>    def store(self, bucket, key, obj, links=[], w=2, dw=2):
>>> -        uphead = {'Content-Type': 'application/json',
>>> -                  'X-Riak-ClientId': self.clientid}
>>> +        uphead = {'Content-Type': 'application/json'}
>>>        try:
>>>            uphead['X-Riak-Vclock'] = obj['__riak_vclock__']
>>>        except KeyError:
>>>
>>>
>>> Any ideas what's going on?
>>>
>>> Thanks,
>>> Jay
>>>
>>>
>>>
>>> _______________________________________________
>>> riak-users mailing list
>>> riak-users at lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>>
>>
>>
>