A script to check bitcask keydir sizes

Justin Sheehy justin at basho.com
Thu Mar 24 13:49:06 EDT 2011


Hi, Greg.

On Thu, Mar 24, 2011 at 10:17 AM, Greg Nelson <grourk at dropcam.com> wrote:
> Wouldn't it be the common case that
> there are relatively few buckets?  And so wouldn't it save a lot of memory
> to keep a reference to an interned bucket name string in each entry, instead
> of the whole bucket name?

One reason this isn't done is that bitcask is an independent
application, used by Riak rather than part of it.  It's just a local
key/value store, and knows nothing of higher-level concepts like
buckets.  Another reason is that there are also users with very many
buckets in use, a situation that makes the proposed solution
uncomfortable.
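
For a rough sense of what interning would and would not save, here is
a back-of-the-envelope sketch in Python.  It is not Bitcask code: the
function name, the 8-byte reference size, and the flat per-entry model
are all assumptions for illustration, and it ignores every other field
a keydir entry carries.

```python
def keydir_key_bytes(buckets, keys_per_bucket, key_len, ref_size=8):
    """Compare bytes spent on key material under two hypothetical layouts.

    full:     every entry stores bucket name + key
    interned: every entry stores a fixed-size reference + key,
              and each bucket name is stored once in a side table
    ref_size is an assumed pointer/index width, not a Bitcask constant.
    """
    name_lens = [len(b.encode("utf-8")) for b in buckets]
    full = sum((n + key_len) * keys_per_bucket for n in name_lens)
    interned = sum(name_lens) + len(buckets) * keys_per_bucket * (ref_size + key_len)
    return full, interned


# Long bucket name: interning roughly halves the key material.
long_full, long_interned = keydir_key_bytes(["x" * 32], 1000, 16)

# Short bucket names: the reference costs more than the names it replaces.
short_full, short_interned = keydir_key_bytes(["users", "sessions"], 1000, 16)
```

Under these assumptions the saving depends entirely on bucket names
being longer than the reference that replaces them.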

In cases where there are truly few buckets and one knows it will stay
that way, one could plausibly modify riak_kv_bitcask_backend (the part
of Riak that talks to Bitcask) to use a bitcask per bucket on each
vnode instead of a single bitcask per vnode.  One downside of that
approach is that if the number of buckets did grow, file descriptor
consumption would be large and the node-wide I/O profile might be
much worse as well.
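
To make the file descriptor concern concrete, here is a small Python
sketch of the arithmetic.  The function and the numbers fed to it
(vnodes per node, open files per bitcask) are assumptions chosen for
illustration, not measurements from a real node.

```python
def estimated_open_files(vnodes_per_node, buckets, files_per_bitcask,
                         bitcask_per_bucket):
    """Rough count of files held open on one node.

    With one bitcask per vnode, the bucket count does not matter.
    With one bitcask per bucket per vnode, it multiplies the total.
    files_per_bitcask is an assumed average, not a Bitcask constant.
    """
    bitcasks = vnodes_per_node * (buckets if bitcask_per_bucket else 1)
    return bitcasks * files_per_bitcask


# Hypothetical node: 64 vnodes, about 3 open files per bitcask.
single = estimated_open_files(64, 1000, 3, bitcask_per_bucket=False)
few = estimated_open_files(64, 4, 3, bitcask_per_bucket=True)
many = estimated_open_files(64, 1000, 3, bitcask_per_bucket=True)
```

With a handful of buckets the per-bucket layout stays manageable; with
a thousand, the same node would be holding hundreds of times as many
files open.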

Everything has tradeoffs.

-Justin



