Riak Client Resources, Deleting a Key Doesn't Remove it from bucket.keys

Aphyr aphyr at aphyr.com
Thu May 26 12:52:13 EDT 2011


Agreed. In fact, jrecursive pointed out to me last week that vnode 
operations are synchronous. That means that when you call list-keys, not 
only is it going to take a long time (right now upwards of 5 minutes) to 
complete, but while each vnode is returning its list of keys *it blocks 
any other requests*.

While list-keys is an unfortunate necessity for some things, its use 
should be minimized if you're going to get to any appreciable (100M 
keys) scale. I don't even know how we're going to use it at all above a 
billion keys. Possibly we'll list the keys periodically from Bitcask 
directly and maintain an index ourselves.
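
To make "maintaining an index ourselves" concrete, here is a rough
sketch, for illustration only. IndexedBucket and its methods are
hypothetical names, not part of riak-client, and a plain Hash stands in
for the real bucket so the snippet runs without a cluster. It does not
show the periodic rebuild from Bitcask.

```ruby
require 'set'

# Hypothetical wrapper (NOT part of riak-client): it records every key it
# writes or deletes, so the key list never has to come from list-keys.
class IndexedBucket
  def initialize(store = {})
    @store = store    # stand-in for a real bucket; a Hash in this sketch
    @index = Set.new  # our own key index
  end

  # Store the value and remember the key.
  def put(key, value)
    @store[key] = value
    @index.add(key)
    value
  end

  # Fetch a value; raises KeyError for a missing key, much as the client
  # surfaces a 404 for a deleted object.
  def get(key)
    @store.fetch(key) { raise KeyError, "not found: #{key}" }
  end

  # Remove the value and forget the key in the same step.
  def delete(key)
    @store.delete(key)
    @index.delete(key)
    nil
  end

  # Answered from the index, never by scanning the store.
  def keys
    @index.to_a
  end
end

bucket = IndexedBucket.new
bucket.put('4000-17.xml', '<links/>')
bucket.put('4000-18.xml', '<links/>')
bucket.delete('4000-17.xml')
p bucket.keys  # prints ["4000-18.xml"]
```

A real version would have to keep the index somewhere durable and stay
correct with concurrent writers, which this sketch ignores.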

--Kyle

On 05/26/2011 09:40 AM, Sean Cribbs wrote:
> With recent commits
> (https://github.com/seancribbs/ripple/compare/35d7323fb0e179c8c971...da3ab71a19d194c65a7b),
> it is cached until you either refresh it manually by passing :reload
> => true or a block (for streaming key lists). This was the compromise
> reached in that pull-request.
>
> All of this caching discussion glosses over the fact that you *should
> not list keys* in any real application. It really raises the question --
> how often do you list keys in Redis, or memcached? I suspect that
> generally you don't. This isn't a relational database. (Also, how often
> do you actually do a full-table scan in MySQL? You don't if you're sane
> -- you use an index, or even LIMIT + OFFSET.)
>
> I'm tempted to remove Document::all and make Bucket#keys harder to
> access, but the balance between discouraging bad behavior and exposing
> available functionality is a hard one to strike. I don't want new
> developers to immediately use list-keys and then be discouraged from
> using Riak because it's slow; on the other hand, it /can be useful/ in
> some circumstances. In those cases where it's useful, the developer
> should probably be responsible enough to request the key list only once;
> the caching behavior simply does this for them. I guess whether it
> /should/ do this for them is the issue at hand.
>
> All that said, I'm really torn on this issue, and the same problem
> applies to full-bucket MapReduce. Caveat emptor.
>
> Sean Cribbs <sean at basho.com>
> Developer Advocate
> Basho Technologies, Inc.
> http://basho.com/
>
> On May 26, 2011, at 10:35 AM, Jonathan Langevin wrote:
>
>> How long is the key list cached like that, naturally?
>>
>> Jonathan Langevin
>> Systems Administrator
>> Loom Inc.
>> Wilmington, NC: (910) 241-0433 - jlangevin at loomlearning.com
>> - www.loomlearning.com - Skype: intel352
>>
>> On Thu, May 26, 2011 at 10:35 AM, Sean Cribbs <sean at basho.com> wrote:
>>
>>     Keith,
>>
>>     There was a pull-request issue out for this on the Github project
>>     (https://github.com/seancribbs/ripple/pull/168). For various
>>     reasons, the list of keys is memoized in the Riak::Bucket
>>     instance. Passing :reload => true to the #keys method will cause
>>     it to refresh. I like to discourage list-keys, but with the
>>     memoized list you don't shoot yourself in the foot as often.
>>
>>     Sean Cribbs <sean at basho.com>
>>     Developer Advocate
>>     Basho Technologies, Inc.
>>     http://basho.com/
>>
>>     On May 26, 2011, at 10:29 AM, Keith Bennett wrote:
>>
>>     > All -
>>     >
>>     > I just started working with Riak, and am using the riak-client
>>     > Ruby gem.
>>     >
>>     > When I delete a key from a bucket and then try to fetch the value
>>     > associated with that key, I get a 404 error (which is reasonable).
>>     > However, the key remains in the bucket's list of keys (i.e. the
>>     > value returned by bucket.keys()). Why is the key still reported to
>>     > exist in the bucket? Is bucket.keys cached, and therefore unaware
>>     > of the deletion? Here's a riak-client Ruby script and its output
>>     > in irb that illustrates this:
>>     >
>>     > ree-1.8.7-2010.02 :001 > require 'riak'
>>     > => true
>>     > ree-1.8.7-2010.02 :002 >
>>     > ree-1.8.7-2010.02 :003 > client = Riak::Client.new
>>     > => #<Riak::Client http://127.0.0.1:8098>
>>     > ree-1.8.7-2010.02 :004 > bucket = client['links']
>>     > => #<Riak::Bucket {links}>
>>     > ree-1.8.7-2010.02 :005 > key = bucket.keys.first
>>     > => "4000-17.xml"
>>     > ree-1.8.7-2010.02 :006 > object = bucket[key]
>>     > => #<Riak::RObject {links,4000-17.xml} [text/xml]:(6430 bytes)>
>>     > ree-1.8.7-2010.02 :007 > object.delete
>>     > => #<Riak::RObject {links,4000-17.xml} [text/xml]:(6430 bytes)>
>>     > ree-1.8.7-2010.02 :008 > bucket.keys.first
>>     > => "4000-17.xml"
>>     > ree-1.8.7-2010.02 :009 > object = bucket[key]
>>     > Riak::HTTPFailedRequest: Expected [200, 300] from Riak but received 404. not found
>>     >
>>     > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/client/net_http_backend.rb:55:in `perform'
>>     > from /Users/kbennett/.rvm/rubies/ree-1.8.7-2010.02/lib/ruby/1.8/net/http.rb:1054:in `request'
>>     > from /Users/kbennett/.rvm/rubies/ree-1.8.7-2010.02/lib/ruby/1.8/net/http.rb:2142:in `reading_body'
>>     > from /Users/kbennett/.rvm/rubies/ree-1.8.7-2010.02/lib/ruby/1.8/net/http.rb:1053:in `request'
>>     > from /Users/kbennett/.rvm/rubies/ree-1.8.7-2010.02/lib/ruby/1.8/net/http.rb:1037:in `request'
>>     > from /Users/kbennett/.rvm/rubies/ree-1.8.7-2010.02/lib/ruby/1.8/net/http.rb:543:in `start'
>>     > from /Users/kbennett/.rvm/rubies/ree-1.8.7-2010.02/lib/ruby/1.8/net/http.rb:1035:in `request'
>>     > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/client/net_http_backend.rb:47:in `perform'
>>     > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/client/net_http_backend.rb:46:in `tap'
>>     > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/client/net_http_backend.rb:46:in `perform'
>>     > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/client/http_backend/transport_methods.rb:59:in `get'
>>     > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/client/http_backend.rb:72:in `fetch_object'
>>     > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/bucket.rb:101:in `[]'
>>     > from riak-delete-failure.rb:9
>>     >
>>     > Thanks,
>>     > Keith
>>     >
>>     >
>>     >
>>     > _______________________________________________
>>     > riak-users mailing list
>>     > riak-users at lists.basho.com
>>     > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>



