update document

francisco treacy francisco.treacy at gmail.com
Thu Nov 5 14:18:02 EST 2009


Thank you very much for your detailed answers.

Paul: returnbody=true really did the trick!
Bryan: regarding my surprise (why it worked a few seconds later), your
explanation was great, and now it makes sense.

I have released the Scala library for Riak, and I've got a couple of
questions/issues:

1. find_all => what would be the natural approach to implement this function?

It looks like right now I have to GET /bucket/, grab the keys array,
and do a single GET per key. Or am I missing something?

2. vclocks => they are sometimes extremely long (vastly outweighing the
object itself). Is that expected, or were those edge cases in my test
scenarios?

3. I usually lose the connection to Riak overnight, but today it
happened once while I was working. (I use config/riak-demo.erlenv and a
version from a couple of days ago, with attachment support via HTTP.)
It won't connect, and if I try to ./start-fresh without killing the
related processes first, it starts throwing strange 500-something
server errors, no matter what the request is.

4. Riak is a pain in the butt to kill (I use config/riak-demo.erlenv).
I know about "killall heart", but sometimes I need to issue that
command twice and then still ps for Erlang processes, and kill epmd as
well. If I don't do that before ./start-fresh, I get the aforementioned
500 server errors.
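
For reference, the approach described in question 1 (GET the bucket,
read its keys array, then GET each key) could look roughly like this in
Ruby. This is only a sketch under assumptions: find_all and the
injected getter are hypothetical names, not part of any Riak client,
the /jiak/ prefix and localhost:8098 are taken from the curl example
quoted below, and the listing is assumed to carry its keys in a "keys"
field.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper, not part of any Riak client library.
# `get` is any callable that takes a path and returns the response body
# as a String, so the HTTP layer can be swapped out or stubbed.
def find_all(bucket, get)
  listing = JSON.parse(get.call("/jiak/#{bucket}"))
  # Assumption: the bucket listing exposes its keys as a "keys" array.
  listing.fetch('keys', []).map do |key|
    # One GET per key, as described in question 1.
    JSON.parse(get.call("/jiak/#{bucket}/#{key}"))
  end
end

# A getter backed by Net::HTTP; defining it does not open a connection.
http_get = ->(path) { Net::HTTP.get(URI("http://localhost:8098#{path}")) }

# Usage against a running node would be: find_all('test', http_get)
```

This makes N+1 requests for N keys, which is exactly the cost the
question is about. Keys containing characters that are not URL-safe
would also need escaping before being placed in the path.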

Well, thanks again!

Francisco



2009/11/3 Bryan Fink <bryan at basho.com>:
> On Mon, Nov 2, 2009 at 11:58 AM, francisco treacy
> <francisco.treacy at gmail.com> wrote:
>> I am developing a Scala library for Riak while I learn more about this
>> datastore.
>>
>> I have covered storing/fetching documents, so far so good. But when I
>> try to 'update' a document I am noticing a behaviour I didn't expect:
>> As an example, when I execute this Ruby code:
>>
>> client = JiakClient.new('localhost', 8098)
>> b = {'key' => "key", 'bucket' => "test", :links => [], 'object' => {
>> :my => "json2" }}
>> c = {'key' => "key", 'bucket' => "test", :links => [], 'object' => {
>> :my => "json3" }}
>> client.store b
>> client.store c
>> r = client.fetch 'test', 'key'
>> puts r['object']['my']
>>
>> the output is always "json2", where I would normally expect "json3".
>>
>> However, immediately after I do:
>> curl -X PUT http://localhost:8098/jiak/test/key \
>>   -H "Content-Type: application/json" \
>>   --data "{\"bucket\":\"test\", \"key\":\"key\", \"object\":{\"my\":\"json4\"}, \"links\":[]}"
>> curl http://localhost:8098/test/key
>>
>> ...and the result is "json4", which seems fine. (If I execute the Ruby
>> code again, I get "json2").
>>
>> So I guess my question is... what is going on here?  Why doesn't it
>> store the object with "json3"?
>> Looks like it can't cope with subsequent updates, but is that tied to
>> the fact of having vclocks or something to do with the N/R/W values?
>
> Hi, Francisco.  Indeed, this does have to do with vclocks.  Put
> simply, because 'c' doesn't contain a vclock, Riak can't tell that it
> *is* a subsequent update.
>
> When Riak can't tell (via vclocks) that a write descends from the
> value that's already in place, it stores the new value as a "sibling"
> to the existing value, instead of overwriting it.  At read time, if
> the 'allow_mult' bucket property is set to 'false', Riak will choose
> one of these sibling values, instead of handing them all to you.  If
> 'allow_mult' is set to true, both values are given to the Jiak layer
> for merging.
>
> In either the allow_mult=false or the default-Jiak-merge case, an
> attempt is made to choose the "latest" value by comparing the
> last-modified-time of each value.  Unfortunately for your example
> case, last-modified-time only has second resolution, so there's a
> pretty good chance that the earlier value will be chosen if the writes
> happened very close together (i.e. in the same second).
>
> This is also the reason that your third PUT, in curl-command form,
> "overwrote" the old value.  It was at least a second later, so the
> timestamp made it obvious which value to choose.  If you were to pause
> between the Ruby client.store calls, you wouldn't see the issue.
>
> The real fix is to include a vclock in your second write.  What
> you really want is:
>
> client = JiakClient.new('localhost', 8098)
> b = {'key' => "key", 'bucket' => "test", :links => [], 'object' => {
> :my => "json2" }}
> c = client.store b
> c['object']['my'] = "json3"
> client.store c
> r = client.fetch 'test', 'key'
> puts r['object']['my']
>
> That code should print out "json3" every time.  The value of 'c' will
> include a vclock telling Riak that this value descends from the old
> value, and should therefore replace it.  No confusion with
> last-modified-time necessary.
>
> The behavior of 'allow_mult' will change slightly in the next release
> of Riak, making this simple case less surprising, but that won't help
> you in more complex, live-system, distributed cases.  Only proper
> vclock management can help you there.
>
> -Bryan
>
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
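
To make the behaviour described in the quoted reply concrete, here is a
toy Ruby model of it. It is not Riak's implementation: ToyBucket is a
hypothetical class, its "vclock" is just a counter, timestamps are
passed in as whole seconds, and the tie-break (the sibling stored first
wins) is an assumption chosen to match the "json2" result seen above.

```ruby
# Toy model only. It imitates the read-time choice between siblings,
# nothing else about Riak.
class ToyBucket
  Stored = Struct.new(:value, :vclock, :last_modified)

  def initialize
    @siblings = Hash.new { |h, k| h[k] = [] }
    @counter = 0
  end

  # Store a value and return its (toy) vclock.
  # `now` is a whole-second timestamp, passed in to keep runs repeatable.
  # `vclock` is the token returned by an earlier store, or nil.
  def store(key, value, now:, vclock: nil)
    @counter += 1
    entry = Stored.new(value, @counter, now)
    current = @siblings[key]
    if vclock && current.any? { |s| s.vclock == vclock }
      # The write descends from a stored value: replace that value.
      @siblings[key] = current.reject { |s| s.vclock == vclock } + [entry]
    else
      # No usable vclock: this cannot be recognised as an update,
      # so the new value is kept next to the old one as a sibling.
      current << entry
    end
    entry.vclock
  end

  # Read in the allow_mult=false style: pick one sibling by
  # last-modified time. Assumption: on a tie the earlier sibling wins.
  def fetch(key)
    chosen = @siblings.fetch(key, []).reduce do |best, s|
      s.last_modified > best.last_modified ? s : best
    end
    chosen&.value
  end
end

# Two writes in the same second, no vclock on the second one.
blind = ToyBucket.new
blind.store('key', 'json2', now: 100)
blind.store('key', 'json3', now: 100)
puts blind.fetch('key')   # prints json2

# A later write wins on the timestamp alone.
blind.store('key', 'json4', now: 101)
puts blind.fetch('key')   # prints json4

# Passing the vclock back makes the second write a real update.
tracked = ToyBucket.new
vc = tracked.store('key', 'json2', now: 100)
tracked.store('key', 'json3', now: 100, vclock: vc)
puts tracked.fetch('key') # prints json3
```

The last case is the one that matters: with the vclock passed back, the
result no longer depends on how close together the writes are.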