Bitcask Key Listing

Kelly McLaughlin kelly at
Tue Aug 19 11:26:36 EDT 2014


There are two aspects of a key listing operation that make it
expensive relative to normal gets or puts.

The first is that, due to the way data is distributed in Riak, key
listing requires a covering set of vnodes to participate in order to
determine the list of keys for a bucket. A minimal covering set works
out to roughly 1/N of the vnodes in the cluster, where N is the n_val of
the bucket. By default this is 3, so in the default case a key listing
request must send a request to, and receive responses from, about 1/3 of
the vnodes in the cluster. This incurs network traversal overhead as the
keys from each vnode are returned, and the time to completion is limited
by the slowest vnode in the covering set. This is true regardless of the
backend in use.
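
As a rough illustration of the arithmetic above, here is a minimal
Python sketch. The function names are hypothetical, the ring size of 64
is only an example value, and rounding up to a whole vnode is an
assumption; this is not Riak's actual coverage planner.

```python
import math

def min_covering_vnodes(ring_size: int, n_val: int) -> int:
    """Approximate size of a minimal covering set: about 1/N of the vnodes.

    Assumption: the count is rounded up to a whole vnode.
    """
    return math.ceil(ring_size / n_val)

def listing_completion_time(vnode_latencies: list[float]) -> float:
    """The listing finishes only when the slowest covering vnode has replied."""
    return max(vnode_latencies)

# Example values (assumed, not measured): a 64-partition ring with n_val = 3.
covering = min_covering_vnodes(64, 3)
slowest = listing_completion_time([0.2, 1.5, 0.4])
```

With these example values, most vnodes replying quickly does not help:
the request still waits on the one that takes 1.5 seconds.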

The second is specific to Bitcask. Bitcask is an unordered backend, and
the consequence for key listing is that every key stored by a vnode
participating in the request must be scanned. It doesn't matter whether
the bucket being queried holds 2 keys or 2000 keys; all of the vnode's
keys, across all buckets, must be scanned. Having all of the keys in
memory helps performance here, but as the amount of data stored
increases, so does the expense of scanning over it. The LevelDB backend
is ordered, and we are able to take advantage of that to scan only the
data for the bucket in question, but for Bitcask that is not an option.
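
The difference between the two backends can be sketched in Python as
follows. This is only a toy model under stated assumptions: keys are
(bucket, key) tuples, the unordered store is a plain list in arbitrary
order, and the ordered store is a sorted list searched with bisect. It
does not reflect the real Bitcask keydir or LevelDB data structures, and
the function names are hypothetical.

```python
import bisect

def list_keys_unordered(entries, bucket):
    """Unordered store: every entry must be examined, whatever the bucket size."""
    found, examined = [], 0
    for b, k in entries:
        examined += 1
        if b == bucket:
            found.append(k)
    return found, examined

def list_keys_ordered(sorted_entries, bucket):
    """Ordered store: seek to the bucket's first entry, stop when the bucket changes."""
    # (bucket,) sorts before every (bucket, key), so this finds the first match.
    start = bisect.bisect_left(sorted_entries, (bucket,))
    found, examined = [], 0
    for b, k in sorted_entries[start:]:
        examined += 1
        if b != bucket:
            break
        found.append(k)
    return found, examined

# Example data (made up): one large bucket and one small bucket on the same vnode.
entries = [("big", f"k{i}") for i in range(2000)] + [("small", "a"), ("small", "b")]
unordered_result = list_keys_unordered(entries, "small")
ordered_result = list_keys_ordered(sorted(entries), "small")
```

In this toy model both calls return the same two keys, but the unordered
scan examines all 2002 entries while the ordered one examines only the
entries belonging to the small bucket.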

At this time there is nothing in the works to specifically improve key
listing performance. It is certainly something we are aware of, but
other things currently have higher priority.

Hope that helps answer your question.


On 08/19/2014 05:17 AM, Jason Campbell wrote:
> I currently maintain my own indexes for some things, and use natural keys where I can, but a question has been nagging me lately.
> Why is key listing slow?  Specifically, why is bitcask key listing slow?
> One of the biggest issues with bitcask is all keys (including the bucket name and some overhead) must fit into RAM.  For large amounts of keys, I understand the coordination data transfer will hurt, but shouldn't things like list buckets (or listing keys from small buckets) be fast?
> Is there a reason this is slow, and is there a plan to fix it?
> Thanks,
> Jason
