Key Filter Timeout

Mark Steele msteele at beringmedia.com
Mon Oct 24 10:51:37 EDT 2011


It was a pretty simple benchmark using a custom-built protocol buffer client, so I wouldn't put too much faith in it.

As for performance, my client retrieved keys from the key listing operation at about 120 thousand keys per second. Throughput was essentially constant: I tested from 1 million stored keys up to 60 million and saw very little variation.
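
For a rough sense of scale, here is a back-of-envelope sketch of how long a full listing would take at that rate. The helper name is made up for illustration, and it assumes the roughly 120 thousand keys per second figure holds for the whole listing, which may not be true on other hardware or backends.

```ruby
# Back-of-envelope estimate of full key-listing time at a steady rate.
# listing_seconds is a hypothetical helper, not part of any Riak client.
def listing_seconds(total_keys, keys_per_second)
  total_keys.to_f / keys_per_second
end

RATE = 120_000 # keys per second, the approximate rate reported above

[1_000_000, 60_000_000].each do |n|
  puts format('%d keys: ~%.0f seconds', n, listing_seconds(n, RATE))
end
```

At that rate, 60 million keys works out to about 500 seconds for one complete listing.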

I guess the mantra is: test, test, test with your app. YMMV.

Cheers,

Mark


On Monday 24 October 2011 07:37:47 Jim Adler wrote:
> Yes, using 1.0.1 with LevelDB. I moved to it from Bitcask in the hopes of better performance.
> 
> Good to hear about your 60M key use-case. Can you share any key access performance numbers?
> 
> Jim
> 
> On Oct 24, 2011, at 7:23 AM, Mark Steele <msteele at beringmedia.com> wrote:
> 
> > Just curious Kyle, you using the 1.0 series?
> > 
> > I've done some informal testing on a 3 node 1.0 cluster and key listing was working just peachy on 60 million keys using bitcask as the backend.
> > 
> > Cheers,
> > 
> > Mark
> > 
> > On Sunday 23 October 2011 12:26:35 Aphyr wrote:
> >> On 10/23/2011 12:11 PM, Jim Adler wrote:
> >>> I will be loosening the key filter criterion after I get the basics
> >>> working, which I thought would be a simple equality check. 8M keys
> >>> isn't really a large data set, is it? I thought that keys were stored
> >>> in memory and key filters just operated on those memory keys and not
> >>> data.
> >>> 
> >>> Jim
> >> 
> >> That's about where we started seeing timeouts in list-keys. Around 25
> >> million keys, list-keys started to take down the cluster. (6 nodes, 1024
> >> partitions). You may not encounter these problems, but were I in your
> >> position and planning to grow... I would prepare to stop using key
> >> filters, bucket listing, and key listing early.
> >> 
> >> Our current strategy is to store the keys in Redis, and synchronize them
> >> with post-commit hooks and a process that reads over bitcask. With
> >> ionice 3, it's fairly low-impact. https://github.com/aphyr/bitcask-ruby
> >> may be useful.
> >> 
> >> --Kyle
> >> 
> >>   # Simplified code, extracted from our bitcask scanner:
> >>   def run
> >>     `renice 10 #{Process.pid}`
> >>     `ionice -c 3 -p #{Process.pid}`
> >> 
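> >>     # The block opener is missing from this simplified excerpt; `loop do`
> >>     # is inferred from the indentation and the otherwise unmatched `end`.
> >>     loop do
> >>       begin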
> >>         bitcasks_dir = '/var/lib/riak/bitcask'
> >>         dirs = Dir.entries(bitcasks_dir).select do |dir|
> >>           dir =~ /^\d+$/
> >>         end.map do |dir|
> >>           File.join(bitcasks_dir, dir)
> >>         end
> >> 
> >>         dirs.each do |dir|
> >>           scan dir
> >>           GC.start
> >>         end
> >>         log.info "Completed run"
> >>       rescue => e
> >>         log.error "#{e}\n#{e.backtrace.join "\n"}"
> >>         sleep 10
> >>       end
> >>     end
> >>   end
> >> 
> >>   def scan(dir)
> >>     log.info "Loading #{dir}"
> >>     b = Bitcask.new dir
> >>     b.load
> >> 
> >>     log.info "Updating #{dir}"
> >>     b.keydir.each do |key, e|
> >>       bucket, key = BERT.decode(key).map { |x|
> >>         Rack::Utils.unescape x
> >>       }
> >>       # Handle determines what to do with this particular bucket/key
> >>       # combo; e.g. insert into redis.
> >>       handle bucket, key, e
> >>     end
> >>   end
> >> 
> >> _______________________________________________
> >> riak-users mailing list
> >> riak-users at lists.basho.com
> >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > 

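The quoted scanner leaves `handle` undefined and only says it might insert into Redis. As a minimal sketch of that idea, the code below keeps one set of keys per bucket. The `KeyIndex` class and its method names are hypothetical, and an in-memory `Set` stands in for Redis so the example runs without a server. The thread does not say how the keys were actually laid out in Redis.

```ruby
require 'set'

# In-memory stand-in for an external key index. Storing one set per bucket
# is an assumption; the thread does not describe the actual Redis layout.
class KeyIndex
  def initialize
    @buckets = Hash.new { |hash, bucket| hash[bucket] = Set.new }
  end

  # Adding the same key twice is a no-op, so repeated scans are safe.
  def add(bucket, key)
    @buckets[bucket] << key
  end

  # Included on the assumption that deletes also need to be synchronized.
  def remove(bucket, key)
    @buckets[bucket].delete(key)
  end

  def keys(bucket)
    @buckets[bucket].to_a.sort
  end
end

INDEX = KeyIndex.new

# Hypothetical handle: records the bucket/key pair and ignores the keydir entry.
def handle(bucket, key, _entry)
  INDEX.add(bucket, key)
end

handle 'users', 'alice', nil
handle 'users', 'bob', nil
handle 'users', 'alice', nil # seen again on a later scan
p INDEX.keys('users')
```

Listing a bucket's keys then becomes a lookup against the index instead of a list-keys call against the cluster.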

